NANBV DIAGNOSTICS AND VACCINES


Technical Field 


The invention relates to materials and methodologies for managing the spread of non-A, non-B hepatitis
virus (NANBV) infection. More specifically, it relates to polynucleotides derived from the genome of an
etiologic agent of NANBH, hepatitis C virus (HCV), to polypeptides encoded therein, and to antibodies
directed to the polypeptides. These reagents are useful as screening agents for HCV and its infection, and
as protective agents against the disease.

Background Art

Non-A, Non-B hepatitis (NANBH) is a transmissible disease or family of diseases that are believed to be
viral-induced, and that are distinguishable from other forms of viral-associated liver diseases, including those
caused by the known hepatitis viruses, i.e., hepatitis A virus (HAV), hepatitis B virus (HBV), and delta
hepatitis virus (HDV), as well as the hepatitis induced by cytomegalovirus (CMV) or Epstein-Barr virus
(EBV). NANBH was first identified in transfused individuals. Transmission from man to chimpanzee and
serial passage in chimpanzees provided evidence that NANBH is due to a transmissible infectious agent or
agents.

Epidemiologic evidence suggests that there may be three types of NANBH: the water-borne
epidemic type; the blood or needle associated type; and the sporadically occurring (community acquired)
type. However, the number of agents which may be causative of NANBH is unknown.

Clinical diagnosis and identification of NANBH has been accomplished primarily by exclusion of other 
viral markers. Among the methods used to detect putative NANBV antigens and antibodies are agar-gel 
diffusion, counterimmunoelectrophoresis, immunofluorescence microscopy, immune electron microscopy, 
radioimmunoassay, and enzyme-linked immunosorbent assay. However, none of these assays has proved 
to be sufficiently sensitive, specific, and reproducible to be used as a diagnostic test for NANBH.

Previously there was neither clarity nor agreement as to the identity or specificity of the
antigen-antibody systems associated with agents of NANBH. This was due, at least in part, to the prior or
co-infection of HBV with NANBV in individuals, and to the known complexity of the soluble and particulate
antigens associated with HBV, as well as to the integration of HBV DNA into the genome of liver cells. In
addition, there is the possibility that NANBH is caused by more than one infectious agent, as well as the
possibility that NANBH has been misdiagnosed. Moreover, it is unclear what the serological assays detect
in the serum of patients with NANBH. It has been postulated that the agar-gel diffusion and
counterimmunoelectrophoresis assays detect autoimmune responses or nonspecific protein interactions that
sometimes occur between serum specimens, and that they do not represent specific NANBV antigen-antibody
reactions. The immunofluorescence, enzyme-linked immunosorbent, and radioimmunoassays appear to
detect low levels of a rheumatoid-factor-like material that is frequently present in the serum of patients with
NANBH as well as in patients with other hepatic and nonhepatic diseases. Some of the reactivity detected
may represent antibody to host-determined cytoplasmic antigens.

There have been a number of candidate NANBVs. See, for example, the reviews by Prince (1983),
Feinstone and Hoofnagle (1984), and Overby (1985, 1986, 1987) and the article by Iwarson (1987).
However, there is no proof that any of these candidates represents the etiological agent of NANBH.

The demand for sensitive, specific methods for screening and identifying carriers of NANBV and 
NANBV contaminated blood or blood products is significant. Post-transfusion hepatitis (PTH) occurs in 
approximately 10% of transfused patients, and NANBH accounts for up to 90% of these cases. The major 
problem in this disease is the frequent progression to chronic liver damage (25-55%).

Patient care as well as the prevention of transmission of NANBH by blood and blood products or by 
close personal contact require reliable screening, diagnostic and prognostic tools to detect nucleic acids, 
antigens and antibodies related to NANBV. In addition, there is also a need for effective vaccines and 
immunotherapeutic agents for the prevention and/or treatment of the disease.

Applicant discovered a new virus, the Hepatitis C virus (HCV), which has proven to be the major
etiologic agent of blood-borne NANBH (BB-NANBH). Applicant's initial work, including a partial genomic
sequence of the prototype HCV isolate, CDC/HCV1 (also called HCV1), is described in EPO Pub. No.
318,216 (published 31 May 1989) and PCT Pub. No. WO 89/04669 (published 1 June 1989). The
disclosures of these patent applications, as well as any corresponding national patent applications, are
incorporated herein by reference. These applications teach, inter alia, recombinant DNA methods of cloning
and expressing HCV sequences, HCV polypeptides, HCV immunodiagnostic techniques, HCV probe
diagnostic techniques, anti-HCV antibodies, and methods of isolating new HCV sequences, including
sequences of new HCV isolates.



Disclosure of the Invention 

The present invention is based, in part, on new HCV sequences and polypeptides that are not disclosed
in EPO Pub. No. 318,216, or in PCT Pub. No. WO 89/04669. Included within the invention is the application
of these new sequences and polypeptides in, inter alia, immunodiagnostics, probe diagnostics, anti-HCV
antibody production, PCR technology and recombinant DNA technology. Included within the invention, also,
are new immunoassays based upon the immunogenicity of HCV polypeptides disclosed herein. The new
subject matter claimed herein, while developed using techniques described in, for example, EPO Pub. No.
318,216, has a priority date which antecedes that publication, or any counterpart thereof. Thus, the invention
provides novel compositions and methods useful for screening samples for HCV antigens and antibodies,
and useful for treatment of HCV infections.

Accordingly, one aspect of the invention is a recombinant polynucleotide comprising a sequence
derived from HCV cDNA, wherein the HCV cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a, or
clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone
205a, or clone 18g, or clone 16jh, or wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17.

Another aspect of the invention is a purified polypeptide comprising an epitope encoded within HCV
cDNA wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to
8866 in Fig. 17.

Yet another aspect of the invention is an immunogenic polypeptide produced by a cell transformed with
a recombinant expression vector comprising an ORF of DNA derived from HCV cDNA, wherein the HCV
cDNA is comprised of a sequence derived from the HCV cDNA sequence in clone CA279a, or clone CA74a,
or clone 13i, or clone CA290a, or clone 33C, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone
8f, or clone 33f, or clone 33g, or clone 39c, or clone 15e, and wherein the ORF is operably linked to a
control sequence compatible with a desired host.

Another aspect of the invention is a peptide comprising an HCV epitope, wherein the peptide is of the
formula

AAx-AAy,

wherein x and y designate amino acid numbers shown in Fig. 17, and wherein the peptide is selected from
the group consisting of AA1-AA25, AA1-AA50, AA1-AA84, AA9-AA177, AA1-AA10, AA5-AA20, AA20-AA25,
AA35-AA45, AA50-AA100, AA40-AA90, AA45-AA65, AA65-AA75, AA80-AA90, AA99-AA120, AA95-AA110,
AA105-AA120, AA100-AA150, AA150-AA200, AA155-AA170, AA190-AA210, AA200-AA250, AA220-AA240,
AA245-AA265, AA250-AA300, AA290-AA330, AA290-AA305, AA300-AA350, AA310-AA330, AA350-AA400,
AA380-AA395, AA405-AA495, AA400-AA450, AA405-AA415, AA415-AA425, AA425-AA435, AA437-AA582,
AA450-AA500, AA440-AA460, AA460-AA470, AA475-AA495, AA500-AA550, AA511-AA690, AA515-AA550,
AA550-AA600, AA550-AA625, AA575-AA605, AA585-AA600, AA600-AA650, AA600-AA625, AA635-AA665,
AA650-AA700, AA645-AA680, AA700-AA750, AA700-AA725, AA700-AA750, AA725-AA775, AA770-AA790,
AA750-AA800, AA800-AA815, AA825-AA850, AA850-AA875, AA800-AA850, AA920-AA990, AA850-AA900,
AA920-AA945, AA940-AA965, AA970-AA990, AA950-AA1000, AA1000-AA1060, AA1000-AA1025, AA1000-AA1050,
AA1025-AA1040, AA1040-AA1055, AA1075-AA1175, AA1050-AA1200, AA1070-AA1100, AA1100-AA1130,
AA1140-AA1165, AA1192-AA1457, AA1195-AA1250, AA1200-AA1225, AA1225-AA1250, AA1250-AA1300,
AA1260-AA1310, AA1260-AA1280, AA1266-AA1428, AA1300-AA1350, AA1290-AA1310, AA1310-AA1340,
AA1345-AA1405, AA1345-AA1365, AA1350-AA1400, AA1365-AA1380, AA1380-AA1405, AA1400-AA1450,
AA1450-AA1500, AA1460-AA1475, AA1475-AA1515, AA1475-AA1500, AA1500-AA1550, AA1500-AA1515,
AA1515-AA1550, AA1550-AA1600, AA1545-AA1560, AA1569-AA1931, AA1570-AA1590, AA1595-AA1610,
AA1590-AA1650, AA1610-AA1645, AA1650-AA1690, AA1685-AA1770, AA1689-AA1805, AA1690-AA1720,
AA1694-AA1735, AA1720-AA1745, AA1745-AA1770, AA1750-AA1800, AA1775-AA1810, AA1795-AA1850,
AA1850-AA1900, AA1900-AA1950, AA1900-AA1920, AA1916-AA2021, AA1920-AA1940, AA1949-AA2124,
AA1950-AA2000, AA1950-AA1985, AA1980-AA2000, AA2000-AA2050, AA2005-AA2025, AA2020-AA2045,
AA2045-AA2100, AA2045-AA2070, AA2054-AA2223, AA2070-AA2100, AA2100-AA2150, AA2150-AA2200,
AA2200-AA2250, AA2200-AA2325, AA2250-AA2330, AA2255-AA2270, AA2265-AA2280, AA2280-AA2290,
AA2287-AA2385, AA2300-AA2350, AA2290-AA2310, AA2310-AA2330, AA2330-AA2350, AA2350-AA2400,
AA2348-AA2464, AA2345-AA2415, AA2345-AA2375, AA2370-AA2410, AA2371-AA2502, AA2400-AA2450,
AA2400-AA2425, AA2415-AA2450, AA2445-AA2500, AA2445-AA2475, AA2470-AA2490, AA2500-AA2550,
AA2505-AA2540, AA2535-AA2560, AA2550-AA2600, AA2560-AA2580, AA2600-AA2650, AA2605-AA2620,
AA2620-AA2650, AA2640-AA2660, AA2650-AA2700, AA2655-AA2670, AA2670-AA2700, AA2700-AA2750,
AA2740-AA2760, AA2750-AA2800, AA2755-AA2780, AA2780-AA2830, AA2785-AA2810, AA2796-AA2886,
AA2810-AA2825, AA2800-AA2850, AA2850-AA2900, AA2850-AA2865, AA2885-AA2905, AA2900-AA2950,
AA2910-AA2930, AA2925-AA2950, AA2945-end (C-terminal).
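
For bookkeeping, the AAx-AAy designations in the foregoing list can be read mechanically. The following is a minimal sketch and not part of the disclosure; it assumes that x and y are inclusive residue numbers in the Fig. 17 numbering, and the helper names `parse_peptide` and `peptide_length` are hypothetical. The open-ended "AA2945-end" entry is deliberately not handled.

```python
import re

# Hypothetical pattern for a designation of the form AAx-AAy.
_PEPTIDE = re.compile(r"^AA(\d+)-AA(\d+)$")


def parse_peptide(designation):
    """Return (x, y) for a designation such as 'AA1192-AA1457'.

    Stray spaces (a common transcription artifact) are ignored.
    """
    match = _PEPTIDE.match(designation.replace(" ", ""))
    if match is None:
        raise ValueError(f"not an AAx-AAy designation: {designation!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start > end:
        raise ValueError(f"x exceeds y in {designation!r}")
    return start, end


def peptide_length(designation):
    """Number of residues, assuming both terminal residues are included."""
    start, end = parse_peptide(designation)
    return end - start + 1
```

Under the inclusive-numbering assumption, `peptide_length("AA1-AA25")` counts 25 residues.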

Still another aspect of the invention is a monoclonal antibody directed against an epitope encoded in
HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or
8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or clone 59a, or clone 84a, or
clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone
205a, or clone 18g, or clone 16jh.

Yet another aspect of the invention is a preparation of purified polyclonal antibodies directed against a
polypeptide comprised of an epitope encoded within HCV cDNA, wherein the HCV cDNA is of a sequence
indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in
clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone
CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh.

Still another aspect of the invention is a polynucleotide probe for HCV, wherein the probe is comprised
of an HCV sequence derived from an HCV cDNA sequence indicated by nucleotide numbers -319 to 1348
or 8659 to 8866 in Fig. 17, or from the complement of the HCV cDNA sequence.

Yet another aspect of the invention is a kit for analyzing samples for the presence of polynucleotides
from HCV comprising a polynucleotide probe containing a nucleotide sequence of about 8 or more
nucleotides, wherein the nucleotide sequence is derived from HCV cDNA which is of a sequence indicated
by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, wherein the polynucleotide probe is in a
suitable container.

Another aspect of the invention is a kit for analyzing samples for the presence of an HCV antigen
comprising an antibody which reacts immunologically with an HCV antigen, wherein the antigen contains an
epitope encoded within HCV cDNA which is of a sequence indicated by nucleotide numbers -319 to 1348 or
8659 to 8866 in Fig. 17, or wherein the HCV cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a,
or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or
clone 205a, or clone 18g, or clone 16jh.

Yet another aspect of the invention is a kit for analyzing samples for the presence of an HCV antibody
comprising an antigenic polypeptide containing an HCV epitope encoded within HCV cDNA which is of a
sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is in clone 13i, or
clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or
clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh.

Another aspect of the invention is a kit for analyzing samples for the presence of an HCV antibody
comprising an antigenic polypeptide expressed from HCV cDNA in clone CA279a, or clone CA74a, or clone
13i, or clone CA290a, or clone 33C, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or
clone 33f, or clone 33g, or clone 39c, or clone 15e, wherein the antigenic polypeptide is present in a
suitable container.

Still another aspect of the invention is a method for detecting HCV nucleic acids in a sample
comprising:

(a) reacting nucleic acids of the sample with a polynucleotide probe for HCV, wherein the probe is
comprised of an HCV sequence derived from an HCV cDNA sequence indicated by
nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, and wherein the reacting is under conditions
which allow the formation of a polynucleotide duplex between the probe and the HCV nucleic acid from the
sample; and

(b) detecting a polynucleotide duplex which contains the probe, formed in step (a).

Yet another aspect of the invention is an immunoassay for detecting an HCV antigen comprising:

(a) incubating a sample suspected of containing an HCV antigen with an antibody directed against an HCV
epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide numbers
-319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or clone 59a,
or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone
ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the incubating is under conditions which
allow formation of an antigen-antibody complex; and

(b) detecting an antibody-antigen complex formed in step (a) which contains the antibody.

Still another aspect of the invention is an immunoassay for detecting antibodies directed against an
HCV antigen comprising:

(a) incubating a sample suspected of containing anti-HCV antibodies with an antigen polypeptide containing
an epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or
clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone
CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the incubating is under
conditions which allow formation of an antigen-antibody complex; and

(b) detecting an antibody-antigen complex formed in step (a) which contains the antigen polypeptide.

Another aspect of the invention is a vaccine for treatment of HCV infection comprising an immunogenic
polypeptide containing an HCV epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence
indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17 or is the sequence present in
clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone
CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the
immunogenic polypeptide is present in a pharmacologically effective dose in a pharmaceutically acceptable
excipient.

Yet another aspect of the invention is a method for producing antibodies to HCV comprising
administering to an individual an isolated immunogenic polypeptide containing an HCV epitope encoded in HCV cDNA,
wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in
Fig. 17, or is of the sequence present in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or
clone 33C, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or
clone 39c, or clone 15e, and wherein the immunogenic polypeptide is present in a pharmacologically
effective dose in a pharmaceutically acceptable excipient.

Still another aspect of the invention is an antisense polynucleotide derived from HCV cDNA, wherein the
HCV cDNA is that shown in Fig. 17.

Yet another aspect of the invention is a method for preparing purified fusion polypeptide C100-3
comprising:

(a) providing a crude cell lysate containing polypeptide C100-3,

(b) treating the crude cell lysate with an amount of acetone which causes the polypeptide to
precipitate,

(c) isolating and solubilizing the precipitated material,

(d) isolating the C100-3 polypeptide by anion exchange chromatography, and

(e) further isolating the C100-3 polypeptide of step (d) by gel filtration.


Brief Description of the Drawings

Fig. 1 shows the sequence of the HCV cDNA in clone 12f, and the amino acids encoded therein.

Fig. 2 shows the HCV cDNA sequence in clone k9-1, and the amino acids encoded therein.

Fig. 3 shows the sequence of clone 15e, and the amino acids encoded therein.

Fig. 4 shows the nucleotide sequence of HCV cDNA in clone 13i, the amino acids encoded therein, 
and the sequences which overlap with clone 12f. 

Fig. 5 shows the nucleotide sequence of HCV cDNA in clone 26j, the amino acids encoded therein, 
and the sequences which overlap clone 13i. 

Fig. 6 shows the nucleotide sequence of HCV cDNA in clone CA59a, the amino acids encoded 
therein, and the sequences which overlap with clones 26j and K9-1.

Fig. 7 shows the nucleotide sequence of HCV cDNA in clone CA84a, the amino acids encoded 
therein, and the sequences which overlap with clone CA59a. 

Fig. 8 shows the nucleotide sequence of HCV cDNA in clone CA156e, the amino acids encoded 
therein, and the sequences which overlap with CA84a. 

Fig. 9 shows the nucleotide sequence of HCV cDNA in clone CA167b, the amino acids encoded 
therein, and the sequences which overlap CA156e. 

Fig. 10 shows the nucleotide sequence of HCV cDNA in clone CA216a, the amino acids encoded 
therein, and the overlap with clone CA167b.

Fig. 11 shows the nucleotide sequence of HCV cDNA in clone CA290a, the amino acids encoded
therein, and the overlap with clone CA216a. 

Fig. 12 shows the nucleotide sequence of HCV cDNA in clone ag30a and the overlap with clone
CA290a.

Fig. 13 shows the nucleotide sequence of HCV cDNA in clone CA205a, and the overlap with the HCV
cDNA sequence in clone CA290a.

Fig. 14 shows the nucleotide sequence of HCV cDNA in clone 18g, and the overlap with the HCV
cDNA sequence in clone ag30a.

Fig. 15 shows the nucleotide sequence of HCV cDNA in clone 16jh, the amino acids encoded therein,
and the overlap of nucleotides with the HCV cDNA sequence in clone 15e.

Fig. 16 shows the ORF of HCV cDNA derived from clones pi14a, CA167b, CA156e, CA84a, CA59a,
K9-1, 12f, 14i, 11b, 7f, 7e, 8h, 33c, 40b, 37b, 35, 36, 81, 32, 33b, 25c, 14c, 8f, 33f, 33g, 39c, 35f, 19g, 26g,
and 15e.

Fig. 17 shows the sense strand of the compiled HCV cDNA sequence derived from the
above-described clones and the compiled HCV cDNA sequence published in EPO Pub. No. 318,216. The clones
from which the sequence was derived are b114a, 18g, ag30a, CA205a, CA290a, CA216a, pi14a, CA167b,
CA156e, CA84a, CA59a, K9-1 (also called k9-1), 26j, 13i, 12f, 14i, 11b, 7f, 7e, 8h, 33c, 40b, 37b, 35, 36, 81,
32, 33b, 25c, 14c, 8f, 33f, 33g, 39c, 35f, 19g, 26g, 15e, b5a, and 16jh. In the figure the three horizontal
dashes above the sequence indicate the position of the putative initiator methionine codon; the two vertical
dashes indicate the first and last nucleotides of the published sequence. Also shown in the figure is the
amino acid sequence of the putative polyprotein encoded in the HCV cDNA.

Fig. 18 is a diagram of the immunological colony screening method used in antigenic mapping
studies.

Fig. 19 shows the hydrophobicity profiles of polyproteins encoded in HCV and in West Nile virus.

Fig. 20 is a tracing of the hydrophilicity/hydrophobicity profile and of the antigenic index of the
putative HCV polyprotein.

Fig. 21 shows the conserved co-linear peptides in HCV and Flaviviruses.

Modes for Carrying Out the Invention 

I. Definitions 

The term "hepatitis C virus" has been reserved by workers in the field for a heretofore unknown
etiologic agent of NANBH. Accordingly, as used herein, "hepatitis C virus" (HCV) refers to an agent
causative of NANBH, which was formerly referred to as NANBV and/or BB-NANBV. The terms HCV,
NANBV, and BB-NANBV are used interchangeably herein. As an extension of this terminology, the disease
caused by HCV, formerly called NANB hepatitis (NANBH), is called hepatitis C. The terms NANBH and
hepatitis C may be used interchangeably herein.

The term "HCV", as used herein, denotes a viral species of which pathogenic strains cause NANBH,
and attenuated strains or defective interfering particles derived therefrom. As shown infra, the HCV genome
is comprised of RNA. It is known that RNA containing viruses have relatively high rates of spontaneous
mutation, i.e., reportedly on the order of 10⁻³ to 10⁻⁴ per incorporated nucleotide (Fields & Knipe (1986)).
Therefore, there are multiple strains, which may be virulent or avirulent, within the HCV species described
infra. The compositions and methods described herein enable the propagation, identification, detection, and
isolation of the various HCV strains or isolates. Moreover, the disclosure herein allows the preparation of
diagnostics and vaccines for the various strains, as well as compositions and methods that have utility in
screening procedures for anti-viral agents for pharmacologic use, such as agents that inhibit replication of
HCV.
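
As a rough illustration of why multiple strains are expected, the reported per-nucleotide mutation rate can be multiplied by a genome length to estimate the number of changes per copied genome. The sketch below is illustrative arithmetic only; the genome length of 9,000 nucleotides and the function name `expected_mutations` are assumptions introduced for the example, not values taken from the disclosure.

```python
def expected_mutations(rate_per_nucleotide, genome_length):
    """Expected number of misincorporations per copied genome.

    Simple product of rate and length; assumes every position mutates
    independently at the same rate, which is a simplification.
    """
    return rate_per_nucleotide * genome_length


# Assumed, illustrative genome length in nucleotides.
ASSUMED_GENOME_LENGTH = 9_000

for rate in (1e-3, 1e-4):  # the reported range per incorporated nucleotide
    count = expected_mutations(rate, ASSUMED_GENOME_LENGTH)
    print(f"rate {rate:g}: about {count:.1f} changes per genome copy")
```

Under these assumptions the estimate is on the order of one to ten changes per copied genome, which is consistent with the expectation of sequence variation among isolates.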

The information provided herein, although derived from the prototype strain or isolate of HCV,
hereinafter referred to as CDC/HCV1 (also called HCV1), is sufficient to allow a viral taxonomist to identify
other strains which fall within the species. The information provided herein allows the belief that HCV is a
Flavi-like virus. The morphology and composition of Flavivirus particles are known, and are discussed in
Brinton (1986). Generally, with respect to morphology, Flaviviruses contain a central nucleocapsid
surrounded by a lipid bilayer. Virions are spherical and have a diameter of about 40-50 nm. Their cores are
about 25-30 nm in diameter. Along the outer surface of the virion envelope are projections that are about
5-10 nm long with terminal knobs about 2 nm in diameter.

Different strains or isolates of HCV are expected to contain variations at the amino acid and nucleic
acid levels compared with the prototype isolate, HCV1. Many isolates are expected to show much (i.e., more than
about 40%) homology in the total amino acid sequence compared with HCV1. However, it may also be
found that there are other, less homologous HCV isolates. These would be defined as HCV strains according to
various criteria such as an ORF of approximately 9,000 nucleotides to approximately 12,000 nucleotides,
encoding a polyprotein similar in size to that of HCV1, an encoded polyprotein of similar hydrophobic and
antigenic character to that of HCV1, and the presence of co-linear peptide sequences that are conserved
with HCV1. In addition, the genome would be a positive-stranded RNA.
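
The ORF-length criterion can be checked computationally once a sense-strand cDNA sequence is in hand. The following is a minimal sketch under stated assumptions: it scans only the three forward reading frames, treats an ORF as running from an ATG to the first in-frame stop codon (stop included in the length), and applies the approximate 9,000 to 12,000 nucleotide range as hard bounds. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(sense_strand):
    """Return (start_index, length_nt) of the longest ATG-to-stop ORF.

    Only the three forward frames of the sense strand are scanned.
    The reported length includes the stop codon. Returns (0, 0) if no
    complete ORF is found.
    """
    seq = sense_strand.upper()
    best = (0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame initiator opens the ORF
            elif start is not None and codon in STOP_CODONS:
                length = i + 3 - start
                if length > best[1]:
                    best = (start, length)
                start = None  # look for the next ORF in this frame
    return best


def orf_in_expected_range(length_nt, low=9_000, high=12_000):
    """Apply the approximate ORF-size criterion as hard bounds (an assumption)."""
    return low <= length_nt <= high
```

Because the text gives the range only approximately, a practical check would likely allow some tolerance around `low` and `high` rather than reject borderline lengths.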

HCV encodes at least one epitope which is immunologically identifiable with an epitope in the HCV
genome from which the cDNAs described herein are derived; preferably the epitope is contained in an amino
acid sequence described herein. The epitope is unique to HCV when compared to other known Flaviviruses.
The uniqueness of the epitope may be determined by its immunological reactivity with anti-HCV antibodies
and lack of immunological reactivity with antibodies to other Flavivirus species. Methods for determining
immunological reactivity are known in the art, for example, radioimmunoassay, ELISA, and
hemagglutination; several examples of suitable assay techniques are provided herein.

In addition to the above, the following parameters of nucleic acid homology and amino acid homology
are applicable, either alone or in combination, in identifying a strain or isolate as HCV. Since HCV strains
and isolates are evolutionarily related, it is expected that the overall homology of the genomes at the
nucleotide level probably will be about 40% or greater, probably about 60% or greater, and even more
probably about 80% or greater; and in addition that there will be corresponding contiguous sequences of at
least about 13 nucleotides. The correspondence between the putative HCV strain genomic sequence and
the CDC/HCV1 cDNA sequence can be determined by techniques known in the art. For example, they can
be determined by a direct comparison of the sequence information of the polynucleotide from the putative
HCV, and the HCV cDNA sequence(s) described herein. For example, also, they can be determined by
hybridization of the polynucleotides under conditions which form stable duplexes between homologous
regions (for example, those which would be used prior to S1 digestion), followed by digestion with
single-strand-specific nuclease(s), followed by size determination of the digested fragments.
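
The direct sequence comparison mentioned above can be sketched in a few lines. This is an illustrative sketch only: it assumes the two sequences have already been aligned to equal length (with "-" marking gaps) for the identity calculation, and it uses exact matching as a simplified stand-in for "corresponding" when looking for a contiguous run. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned positions carrying the same nucleotide.

    Both inputs must be pre-aligned and of equal length; gap
    characters ('-') never count as matches.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned, equal length, non-empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def longest_shared_run(seq_a, seq_b):
    """Length of the longest contiguous stretch present in both sequences."""
    best = 0
    prev = [0] * (len(seq_b) + 1)
    for a in seq_a:
        curr = [0] * (len(seq_b) + 1)
        for j, b in enumerate(seq_b, start=1):
            if a == b:
                # extend the run that ended at the previous pair of positions
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best
```

A putative isolate could then be screened with, for example, `percent_identity(a, b) >= 40.0 and longest_shared_run(a, b) >= 13`, where the thresholds are taken from the approximate figures in the preceding paragraph.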

Because of the evolutionary relationship of the strains or isolates of HCV, putative HCV strains or
isolates are identifiable by their homology at the polypeptide level. Generally, HCV strains or isolates are
expected to be more than about 40% homologous, probably more than about 70% homologous, and even
more probably more than about 80% homologous, and some may even be more than about 90%
homologous at the polypeptide level. The techniques for determining amino acid sequence homology are
known in the art. For example, the amino acid sequence may be determined directly and compared to the
sequences provided herein. Alternatively the nucleotide sequence of the genomic material of the putative
HCV may be determined (usually via a cDNA intermediate), the amino acid sequence encoded therein can
be determined, and the corresponding regions compared.
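
The second route, deducing the amino acid sequence from cDNA and then comparing corresponding regions, can be sketched as follows. The sketch assumes the standard genetic code, a sequence already in the correct reading frame, and peptide regions already aligned to equal length; `translate` and `peptide_identity` are hypothetical helper names.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(cdna):
    """Translate an in-frame cDNA sense strand; '*' marks a stop codon.

    Trailing nucleotides that do not fill a codon are ignored.
    """
    seq = cdna.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def peptide_identity(region_a, region_b):
    """Percent identical residues between two aligned, equal-length regions."""
    if len(region_a) != len(region_b) or not region_a:
        raise ValueError("regions must be aligned, equal length, non-empty")
    same = sum(1 for x, y in zip(region_a, region_b) if x == y)
    return 100.0 * same / len(region_a)
```

Identity is used here as a simple proxy for homology; a comparison that also credits conservative substitutions would need a scoring matrix, which is outside this sketch.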

As used herein, a polynucleotide "derived from" a designated sequence refers to a polynucleotide
sequence which is comprised of a sequence of at least about 6 nucleotides, preferably at
least about 8 nucleotides, more preferably at least about 10-12 nucleotides, and even more preferably at
least about 15-20 nucleotides corresponding to a region of the designated nucleotide sequence.
"Corresponding" means homologous to or complementary to the designated sequence. Preferably, the
sequence of the region from which the polynucleotide is derived is homologous to or complementary to a
sequence which is unique to an HCV genome. Whether or not a sequence is unique to the HCV genome
can be determined by techniques known to those of skill in the art. For example, the sequence can be
compared to sequences in databanks, e.g., GenBank, to determine whether it is present in the uninfected
host or other organisms. The sequence can also be compared to the known sequences of other viral
agents, including those which are known to induce hepatitis, e.g., HAV, HBV, and HDV, and to other
members of the Flaviviridae. The correspondence or non-correspondence of the derived sequence to other
sequences can also be determined by hybridization under the appropriate stringency conditions.
Hybridization techniques for determining the complementarity of nucleic acid sequences are known in the art, and
are discussed infra. See also, for example, Maniatis et al. (1982). In addition, mismatches of duplex
polynucleotides formed by hybridization can be determined by known techniques, including for example,
digestion with a nuclease such as S1 that specifically digests single-stranded areas in duplex
polynucleotides. Regions from which typical DNA sequences may be "derived" include but are not limited to,
for example, regions encoding specific epitopes, as well as non-transcribed and/or non-translated regions.
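
The "corresponding" test in this definition (homologous to, or complementary to, a region of the designated sequence) can be approximated in software. The sketch below is a simplification under stated assumptions: it requires an exact match of a window of `min_length` nucleotides, which is stricter than homology, and it handles DNA letters only. The names `reverse_complement` and `corresponds` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def corresponds(candidate, designated, min_length=8):
    """True if the candidate shares an exact run of min_length nucleotides
    with the designated sequence or with its complement.

    min_length=8 mirrors the 'preferably at least about 8 nucleotides'
    figure; other values from the definition (6, 10-12, 15-20) may be used.
    """
    candidate = candidate.upper()
    targets = (designated.upper(), reverse_complement(designated))
    for i in range(len(candidate) - min_length + 1):
        window = candidate[i:i + min_length]
        if any(window in target for target in targets):
            return True
    return False
```

Uniqueness to the HCV genome is a separate question; it would be addressed by running the same kind of comparison against host and other viral sequences, as the paragraph above describes.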

The derived polynucleotide is not necessarily physically derived from the nucleotide sequence shown,
but may be generated in any manner, including for example, chemical synthesis or DNA replication or 
reverse transcription or transcription. In addition, combinations of regions corresponding to that of the 
designated sequence may be modified in ways known in the art to be consistent with an intended use. 

Similarly, a polypeptide or amino acid sequence "derived from" a designated nucleic acid sequence
refers to a polypeptide having an amino acid sequence identical to that of a polypeptide encoded in the
sequence, or a portion thereof wherein the portion consists of at least 3-5 amino acids, and more preferably
at least 8-10 amino acids, and even more preferably at least 11-15 amino acids, or which is
immunologically identifiable with a polypeptide encoded in the sequence.

A recombinant or derived polypeptide is not necessarily translated from a designated nucleic acid
sequence, for example, the HCV cDNA sequences described herein, or from an HCV genome; it may be
generated in any manner, including for example, chemical synthesis, or expression of a recombinant
expression system, or isolation from mutated HCV. A recombinant or derived polypeptide may include one
or more analogs of amino acids or unnatural amino acids in its sequence. Methods of inserting analogs of
amino acids into a sequence are known in the art. It also may include one or more labels, which are known
to those of skill in the art.

The term "recombinant polynucleotide" as used herein intends a polynucleotide of genomic, cDNA,
semisynthetic, or synthetic origin which, by virtue of its origin or manipulation: (1) is not associated
with all or a portion of a polynucleotide with which it is associated in nature, (2) is linked to a polynucleotide
other than that to which it is linked in nature, or (3) does not occur in nature.

The term "polynucleotide" as used herein refers to a polymeric form of nucleotides of any length, either
ribonucleotides or deoxyribonucleotides. This term refers only to the primary structure of the molecule.
Thus, this term includes double- and single-stranded DNA, as well as double- and single-stranded RNA. It
also includes known types of modifications, for example, labels which are known in the art, methylation,
"caps", substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide
modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates,
phosphotriesters, phosphoamidates, carbamates, etc.) and with charged linkages (e.g., phosphorothioates,
phosphorodithioates, etc.), those containing pendant moieties, such as, for example, proteins (including,
e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), those with intercalators (e.g.,
acridine, psoralen, etc.), those containing chelators (e.g., metals, radioactive metals, boron, oxidative metals,
etc.), those containing alkylators, those with modified linkages (e.g., alpha anomeric nucleic acids, etc.), as
well as unmodified forms of the polynucleotide.

The term "purified viral polynucleotide" refers to an HCV genome or fragment thereof which is
essentially free, i.e., contains less than about 50%, preferably less than about 70%, and even more
preferably less than about 90% of polypeptides with which the viral polynucleotide is naturally associated.
Techniques for purifying viral polynucleotides from viral particles are known in the art, and include for
example, disruption of the particle with a chaotropic agent, differential extraction and separation of the
polynucleotide(s) and polypeptides by ion-exchange chromatography, affinity chromatography, and
sedimentation according to density.

The term "purified viral polypeptide" refers to an HCV polypeptide or fragment thereof which is
essentially free, i.e., contains less than about 50%, preferably less than about 70%, and even more
preferably less than about 90%, of cellular components with which the viral polypeptide is naturally
associated. Techniques for purifying viral polypeptides are known in the art, and examples of these
techniques are discussed infra. The term "purified viral polynucleotide" refers to an HCV genome or
fragment thereof which is essentially free, i.e., contains less than about 20%, preferably less than about
50%, and even more preferably less than about 70% of polypeptides with which the viral polynucleotide is
naturally associated. Techniques for purifying viral polynucleotides from viral particles are known in the art,
and include for example, disruption of the particle with a chaotropic agent, and separation of the
polynucleotide(s) and polypeptides by ion-exchange chromatography, affinity chromatography, and
sedimentation according to density.

"Recombinant host cells", "host cells", "cells", "cell lines", "cell cultures", and other such terms 
denoting microorganisms or higher eukaryotic cell lines cultured as unicellular entities refer to cells which
can be, or have been, used as recipients for recombinant vector or other transfer DNA, and include the
progeny of the original cell which has been transfected. It is understood that the progeny of a single
parental cell may not necessarily be completely identical in morphology or in genomic or total DNA
complement to the original parent, due to natural, accidental, or deliberate mutation.

A "replicon” is any genetic element, e.g., a plasmid, a chromosome, a virus, a cosmid, etc. that 
behaves as an autonomous unit of polynucleotide replication within a cell; i.e., capable of replication under
its own control.

A "vector" is a replicon in which another polynucleotide segment is attached, so as to bring about the 
replication and/or expression of the attached segment. 

"Control sequence" refers to polynucleotide sequences which are necessary to effect the expression of 
coding sequences to which they are ligated. The nature of such control sequences differs depending upon 
the host organism; in prokaryotes, such control sequences generally include promoter, ribosomal binding
site, and terminators; in eukaryotes, generally, such control sequences include promoters, terminators and,
in some instances, enhancers. The term "control sequences" is intended to include, at a minimum, all
components whose presence is necessary for expression, and may also include additional components
whose presence is advantageous, for example, leader sequences.

"Operably linked" refers to a juxtaposition wherein the components so described are in a relationship
permitting them to function in their intended manner. A control sequence "operably linked" to a coding
sequence is ligated in such a way that expression of the coding sequence is achieved under conditions
compatible with the control sequences.

An "open reading frame" (ORF) is a region of a polynucleotide sequence which encodes a polypeptide;
this region may represent a portion of a coding sequence or a total coding sequence.

A "coding sequence" is a polynucleotide sequence which is transcribed into mRNA and/or translated
into a polypeptide when placed under the control of appropriate regulatory sequences. The boundaries of
the coding sequence are determined by a translation start codon at the 5'-terminus and a translation stop
codon at the 3'-terminus. A coding sequence can include, but is not limited to mRNA, cDNA, and
recombinant polynucleotide sequences.

"Immunologically identifiable with/as" refers to the presence of epitope(s) and polypeptide(s) which are
also present in the designated polypeptide(s), usually HCV proteins. Immunological identity may be
determined by antibody binding and/or competition in binding; these techniques are known to those of
average skill in the art, and are also illustrated infra.

As used herein, "epitope" refers to an antigenic determinant of a polypeptide; an epitope could
comprise 3 amino acids in a spatial conformation which is unique to the epitope; generally an epitope
consists of at least 5 such amino acids, and more usually, consists of at least 8-10 such amino acids.
Methods of determining the spatial conformation of amino acids are known in the art, and include, for
example, x-ray crystallography and 2-dimensional nuclear magnetic resonance.

A polypeptide is "immunologically reactive" with an antibody when it binds to an antibody due to
antibody recognition of a specific epitope contained within the polypeptide. Immunological reactivity may be
determined by antibody binding, more particularly by the kinetics of antibody binding, and/or by competition
in binding using as competitor(s) a known polypeptide(s) containing an epitope against which the antibody
is directed. The techniques for determining whether a polypeptide is immunologically reactive with an
antibody are known in the art.

As used herein, the term "immunogenic polypeptide" is a polypeptide that elicits a cellular and/or 
humoral response, whether alone or linked to a carrier in the presence or absence of an adjuvant. 

The term "polypeptide" refers to a polymer of amino acids and does not refer to a specific length of the
product; thus, peptides, oligopeptides, and proteins are included within the definition of polypeptide. This
term also does not refer to or exclude post-expression modifications of the polypeptide, for example,
glycosylations, acetylations, phosphorylations and the like. Included within the definition are, for example,
polypeptides containing one or more analogs of an amino acid (including, for example, unnatural amino
acids, etc.), polypeptides with substituted linkages, as well as other modifications known in the art, both
naturally occurring and non-naturally occurring.

"Transformation”, as used herein, refers to the insertion of an exogenous polynucleotide into a host cell, 
irrespective of the method used for the insertion, for example, direct uptake, transduction, f-mating or
electroporation. The exogenous polynucleotide may be maintained as a non-integrated vector, for example,
a plasmid, or alternatively, may be integrated into the host genome.

"Treatment” as used herein refers to prophylaxis and/or therapy. 

An "individual", as used herein, refers to vertebrates, particularly members of the mammalian species, 
and includes but is not limited to domestic animals, sports animals, and primates, including humans. 

As used herein, the "sense strand" of a nucleic acid contains the sequence that has sequence
homology to that of mRNA. The "anti-sense strand" contains a sequence which is complementary to that of
the "sense strand".

As used herein, a "positive stranded genome" of a virus is one in which the genome, whether RNA or
DNA, is single-stranded and which encodes a viral polypeptide(s). Examples of positive stranded RNA
viruses include Togaviridae, Coronaviridae, Retroviridae, Picornaviridae, and Caliciviridae. Included also are
the Flaviviridae, which were formerly classified as Togaviridae. See Fields & Knipe (1986).

As used herein, "antibody-containing body component" refers to a component of an individual's body
which is a source of the antibodies of interest. Antibody-containing body components are known in the art,
and include but are not limited to, for example, plasma, serum, spinal fluid, lymph fluid, the external
sections of the respiratory, intestinal, and genitourinary tracts, tears, saliva, milk, white blood cells, and
myelomas.

As used herein, "purified HCV" refers to a preparation of HCV which has been isolated from the cellular
constituents with which the virus is normally associated, and from other types of viruses which may be
present in the infected tissue. The techniques for isolating viruses are known to those of skill in the art, and
include, for example, centrifugation and affinity chromatography; a method of preparing purified HCV is
discussed infra.

The term "HCV particles" as used herein includes entire virions as well as particles which are
intermediates in virion formation. HCV particles generally have one or more HCV proteins associated with
the HCV nucleic acid.

As used herein, the term "probe" refers to a polynucleotide which forms a hybrid structure with a
sequence in a target region, due to complementarity of at least one sequence in the probe with a sequence
in the target region. The probe, however, does not contain a sequence complementary to sequence(s) used
to prime the polymerase chain reaction.

As used herein, the term "target region" refers to a region of the nucleic acid which is to be amplified
and/or detected.

As used herein, the term "viral RNA", which includes HCV RNA, refers to RNA from the viral genome,
fragments thereof, transcripts thereof, and mutant sequences derived therefrom.

As used herein, a "biological sample" refers to a sample of tissue or fluid isolated from an individual,
including but not limited to, for example, plasma, serum, spinal fluid, lymph fluid, the external sections of
the skin, respiratory, intestinal, and genitourinary tracts, tears, saliva, milk, blood cells, tumors, organs, and
also samples of in vitro cell culture constituents (including but not limited to conditioned medium resulting
from the growth of cells in cell culture medium, putatively virally infected cells, recombinant cells, and cell
components).

II. Description of the Invention 

The practice of the present invention will employ, unless otherwise indicated, conventional techniques
of molecular biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art.
Such techniques are explained fully in the literature. See e.g., Maniatis, Fritsch & Sambrook, MOLECULAR
CLONING: A LABORATORY MANUAL (1982); DNA CLONING, VOLUMES I AND II (D.N. Glover ed. 1985);
OLIGONUCLEOTIDE SYNTHESIS (M.J. Gait ed. 1984); NUCLEIC ACID HYBRIDIZATION (B.D. Hames &
S.J. Higgins eds. 1984); TRANSCRIPTION AND TRANSLATION (B.D. Hames & S.J. Higgins eds. 1984);
ANIMAL CELL CULTURE (R.I. Freshney ed. 1986); IMMOBILIZED CELLS AND ENZYMES (IRL Press,
1986); B. Perbal, A PRACTICAL GUIDE TO MOLECULAR CLONING (1984); the series, METHODS IN
ENZYMOLOGY (Academic Press, Inc.); GENE TRANSFER VECTORS FOR MAMMALIAN CELLS (J.H.
Miller and M.P. Calos eds. 1987, Cold Spring Harbor Laboratory), Methods in Enzymology Vol. 154 and Vol.
155 (Wu and Grossman, and Wu, eds., respectively), Mayer and Walker, eds. (1987), IMMUNOCHEMICAL
METHODS IN CELL AND MOLECULAR BIOLOGY (Academic Press, London), Scopes, (1987), PROTEIN
PURIFICATION: PRINCIPLES AND PRACTICE, Second Edition (Springer-Verlag, N.Y.), and HANDBOOK
OF EXPERIMENTAL IMMUNOLOGY, VOLUMES I-IV (D.M. Weir and C.C. Blackwell eds. 1986). All patents,
patent applications, and publications mentioned herein, both supra and infra, are hereby incorporated herein
by reference.

The useful materials and processes of the present invention are made possible by the provision of a
family of nucleotide sequences isolated from cDNA libraries which contain HCV cDNA sequences. These
cDNA libraries were derived from nucleic acid sequences present in the plasma of an HCV-infected
chimpanzee. The construction of one of these libraries, the "c" library (ATCC No. 40394), was reported in
EPO Pub. No. 318,216. Several of the clones containing HCV cDNA reported herein were obtained from the
"c" library. Although other clones reported herein were obtained from other HCV cDNA libraries, the
presence of clones containing the sequences in the "c" library was confirmed. As discussed in EPO Pub.
No. 318,216, the family of HCV cDNA sequences isolated from the "c" library are not of human or
chimpanzee origin, and show no significant homology to sequences contained within the HBV genome.

The availability of the HCV cDNAs described herein permits the construction of polynucleotide probes
which are reagents useful for detecting viral polynucleotides in biological samples, including donated blood.
For example, from the sequences it is possible to synthesize DNA oligomers of about 8-10 nucleotides, or
larger, which are useful as hybridization probes to detect the presence of HCV RNA in, for example,
donated blood, sera of subjects suspected of harboring the virus, or cell culture systems in which the virus
is replicating. In addition, the cDNA sequences also allow the design and production of HCV specific
polypeptides which are useful as diagnostic reagents for the presence of antibodies raised during HCV
infection. Antibodies to purified polypeptides derived from the cDNAs may also be used to detect viral
antigens in biological samples, including, for example, donated blood samples, sera from patients with
NANBH, and in tissue culture systems being used for HCV replication. Moreover, the immunogenic
polypeptides disclosed herein, which are encoded in portions of the ORF of HCV cDNA shown in Fig. 17,
are also useful for HCV screening, diagnosis, and treatment, and for raising antibodies which are also useful
for these purposes.
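
As an illustration only, the enumeration of candidate hybridization probes from a known cDNA sequence might be sketched as follows. The sketch assumes the input is the sense-strand cDNA written 5' to 3', so that a DNA oligomer able to hybridize to the positive-stranded viral RNA is the reverse complement of each window. The function names, the default probe length, and the toy sequence are hypothetical and are not taken from the specification; a real probe would also be selected for specificity and hybridization behavior, which this sketch does not address.

```python
# Minimal sketch: list antisense DNA oligomers of a fixed length along a cDNA.
# Assumption: `cdna` is the sense strand (same polarity as the viral RNA), 5'->3'.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def candidate_probes(cdna: str, length: int = 20, step: int = 1):
    """Yield (start, probe) pairs.

    `start` is the 0-based offset of the target window in `cdna`;
    `probe` is the oligomer complementary to that window, written 5'->3'.
    """
    if length < 1 or step < 1:
        raise ValueError("length and step must be positive")
    cdna = cdna.upper()
    for start in range(0, len(cdna) - length + 1, step):
        yield start, reverse_complement(cdna[start:start + length])


if __name__ == "__main__":
    toy_cdna = "ATGGCCTTAGGC"  # hypothetical toy sequence, not HCV
    for start, probe in candidate_probes(toy_cdna, length=8, step=2):
        print(start, probe)
```

A window length of 8 is used in the usage lines only to keep the toy output short; the text above speaks of oligomers of about 8-10 nucleotides or larger.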

In addition, the novel cDNA sequences described herein enable further characterization of the HCV
genome. Polynucleotide probes and primers derived from these sequences may be used to amplify
sequences present in cDNA libraries, and/or to screen cDNA libraries for additional overlapping cDNA
sequences, which, in turn, may be used to obtain more overlapping sequences. As indicated infra, and in
EPO Pub. No. 318,216, the genome of HCV appears to be RNA comprised primarily of a large open
reading frame (ORF) which encodes a large polyprotein.

The HCV cDNA sequences provided herein, the polypeptides derived from these sequences, and the
immunogenic polypeptides described herein, as well as antibodies directed against these polypeptides are
also useful in the isolation and identification of the blood-borne NANBV (BB-NANBV) agent(s). For example,
antibodies directed against HCV epitopes contained in polypeptides derived from the cDNAs may be used
in processes based upon affinity chromatography to isolate the virus. Alternatively, the antibodies may be
used to identify viral particles isolated by other techniques. The viral antigens and the genomic material
within the isolated viral particles may then be further characterized.

In addition to the above, the information provided infra allows the identification of additional HCV strains
or isolates. The isolation and characterization of the additional HCV strains or isolates may be accomplished
by isolating the nucleic acids from body components which contain viral particles and/or viral RNA, creating
cDNA libraries using polynucleotide probes based on the HCV cDNA probes described infra, screening the
libraries for clones containing HCV cDNA sequences described infra, and comparing the HCV cDNAs from
the new isolates with the cDNAs described infra. The polypeptides encoded therein, or in the viral genome,
may be monitored for immunological cross-reactivity utilizing the polypeptides and antibodies described
supra. Strains or isolates which fit within the parameters of HCV, as described in the Definitions section,
supra, are readily identifiable. Other methods for identifying HCV strains will be obvious to those of skill in
the art, based upon the information provided herein.

Isolation of the HCV cDNA Sequences 

The novel HCV cDNA sequences described infra extend the sequence of the cDNA to the HCV
genome reported in EPO Pub. No. 318,216. The sequences which are present in clones b114a, 18g, ag30a,
CA205a, CA290a, CA216a, pi14a, CA167b, CA156e, CA84a, and CA59a lie upstream of the reported
sequence, and when compiled, yield nucleotides nos. -319 to 1348 of the composite HCV cDNA sequence.
(The negative number on a nucleotide indicates its distance upstream of the nucleotide which starts the
putative initiator MET codon.) The sequences which are present in clones b5a and 16jh lie downstream of
the reported sequence, and yield nucleotides nos. 8659 to 8866 of the composite sequence. The composite
HCV cDNA sequence which includes the sequences in the aforementioned clones, is shown in Fig. 17.
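
The signed nucleotide numbering used above can be related to an ordinary string offset with a small helper. The sketch below rests on an assumption that the text does not state: that the first nucleotide of the putative initiator MET codon is numbered 1 and that there is no nucleotide 0, so the nucleotide immediately upstream is -1. If the figure uses a different convention, the arithmetic changes accordingly. The function names are hypothetical.

```python
# Minimal sketch of signed nucleotide numbering relative to the initiator codon.
# ASSUMPTION (not stated in the text): first nucleotide of the MET codon is +1
# and there is no nucleotide numbered 0.


def offset_to_number(offset: int, upstream_len: int) -> int:
    """Map a 0-based offset in the composite sequence to a signed number.

    `upstream_len` is the count of nucleotides preceding the initiator codon.
    """
    relative = offset - upstream_len  # 0 at the first nucleotide of the codon
    return relative + 1 if relative >= 0 else relative


def number_to_offset(number: int, upstream_len: int) -> int:
    """Inverse of offset_to_number; rejects 0 under the assumed convention."""
    if number == 0:
        raise ValueError("no nucleotide 0 under the assumed convention")
    relative = number - 1 if number > 0 else number
    return relative + upstream_len


if __name__ == "__main__":
    upstream = 319  # the composite sequence described above begins at -319
    print(offset_to_number(0, upstream))    # first nucleotide of the composite
    print(number_to_offset(1, upstream))    # offset of the start of the codon
```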

The novel HCV cDNAs described herein were isolated from a number of HCV cDNA libraries, including
the "c" library present in lambda gt11 (ATCC No. 40394). The HCV cDNA libraries were constructed using
pooled serum from a chimpanzee with chronic HCV infection and containing a high titer of the virus, i.e., at
least 10^6 chimp infectious doses/ml (CID/ml). The pooled serum was used to isolate viral particles; nucleic
acids isolated from these particles were used as the template in the construction of cDNA libraries to the
viral genome. The procedures for isolation of putative HCV particles and for constructing the "c" HCV cDNA
library are described in EPO Pub. No. 318,216. Other methods for constructing HCV cDNA libraries are
known in the art, and some of these methods are described infra, in the Examples. Isolation of the
sequences was by screening the libraries using synthetic polynucleotide probes, the sequences of which
were derived from the 5'-region and the 3'-region of the known HCV cDNA sequence. The description of
the method to retrieve the cDNA sequences is mostly of historical interest. The resultant sequences (and
their complements) are provided herein, and the sequences, or any portion thereof, could be prepared
using synthetic methods, or by a combination of synthetic methods with retrieval of partial sequences using
methods similar to those described herein.


Preparation of Viral Polypeptides and Fragments

The availability of HCV cDNA sequences, or nucleotide sequences derived therefrom (including
segments and modifications of the sequence), permits the construction of expression vectors encoding
antigenically active regions of the polypeptide encoded in either strand. These antigenically active regions
may be derived from coat or envelope antigens or from core antigens, or from antigens which are
non-structural including, for example, polynucleotide binding proteins, polynucleotide polymerase(s), and other
viral proteins required for the replication and/or assembly of the virus particle. Fragments encoding the
desired polypeptides are derived from the cDNA clones using conventional restriction digestion or by
synthetic methods, and are ligated into vectors which may, for example, contain portions of fusion
sequences such as beta-galactosidase or superoxide dismutase (SOD), preferably SOD. Methods and
vectors which are useful for the production of polypeptides which contain fusion sequences of SOD are
described in European Patent Office Publication number 0196056, published October 1, 1986. Vectors for
the expression of fusion polypeptides of SOD and HCV polypeptides encoded in a number of HCV clones
are described infra, in the Examples. Any desired portion of the HCV cDNA containing an open reading
frame, in either sense strand, can be obtained as a recombinant polypeptide, such as a mature or fusion
protein; alternatively, a polypeptide encoded in the cDNA can be provided by chemical synthesis.

The DNA encoding the desired polypeptide, whether in fused or mature form, and whether or not
containing a signal sequence to permit secretion, may be ligated into expression vectors suitable for any
convenient host. Both eukaryotic and prokaryotic host systems are presently used in forming recombinant
polypeptides, and a summary of some of the more common control systems and host cell lines is given
infra. The polypeptide is then isolated from lysed cells or from the culture medium and purified to the extent
needed for its intended use. Purification may be by techniques known in the art, for example, differential
extraction, salt fractionation, chromatography on ion exchange resins, affinity chromatography,
centrifugation, and the like. See, for example, Methods in Enzymology for a variety of methods for purifying proteins.
Such polypeptides can be used as diagnostics, or those which give rise to neutralizing antibodies may be
formulated into vaccines. Antibodies raised against these polypeptides can also be used as diagnostics, or
for passive immunotherapy. In addition, as discussed infra, antibodies to these polypeptides are useful for
isolating and identifying HCV particles.


Preparation of Antigenic Polypeptides and Conjugation with Carrier 

An antigenic region of a polypeptide is generally relatively small, typically 8 to 10 amino acids or less
in length. Fragments of as few as 5 amino acids may characterize an antigenic region. These segments
may correspond to regions of HCV antigen. Accordingly, using the cDNAs of HCV as a basis, DNAs
encoding short segments of HCV polypeptides can be expressed recombinantly either as fusion proteins, or
as isolated polypeptides. In addition, short amino acid sequences can be conveniently obtained by chemical
synthesis. In instances wherein the synthesized polypeptide is correctly configured so as to provide the
correct epitope, but is too small to be immunogenic, the polypeptide may be linked to a suitable carrier.

A number of techniques for obtaining such linkage are known in the art, including the formation of
disulfide linkages using N-succinimidyl-3-(2-pyridylthio)propionate (SPDP) and succinimidyl
4-(N-maleimidomethyl)cyclohexane-1-carboxylate (SMCC) obtained from Pierce Company, Rockford, Illinois (if the
peptide lacks a sulfhydryl group, this can be provided by addition of a cysteine residue). These reagents
create a disulfide linkage between themselves and peptide cysteine residues on one protein and an amide
linkage through the epsilon-amino on a lysine, or other free amino group in the other. A variety of such
disulfide/amide-forming agents are known. See, for example, Immun. Rev. (1982) 62:185. Other bifunctional
coupling agents form a thioether rather than a disulfide linkage. Many of these thioether-forming agents are
commercially available and include reactive esters of 6-maleimidocaproic acid, 2-bromoacetic acid,
2-iodoacetic acid, 4-(N-maleimidomethyl)cyclohexane-1-carboxylic acid, and the like. The carboxyl groups can
be activated by combining them with succinimide or 1-hydroxyl-2-nitro-4-sulfonic acid, sodium salt.
Additional methods of coupling antigens employ the rotavirus/"binding peptide" system described in EPO
Pub. No. 259,149, the disclosure of which is incorporated herein by reference. The foregoing list is not
meant to be exhaustive, and modifications of the named compounds can clearly be used.

Any carrier may be used which does not itself induce the production of antibodies harmful to the host.
Suitable carriers are typically large, slowly metabolized macromolecules such as proteins; polysaccharides,
such as latex functionalized sepharose, agarose, cellulose, cellulose beads and the like; polymeric amino
acids, such as polyglutamic acid, polylysine, and the like; amino acid copolymers; and inactive virus
particles. Especially useful protein substrates are serum albumins, keyhole limpet hemocyanin,
immunoglobulin molecules, thyroglobulin, ovalbumin, tetanus toxoid, and other proteins well known to those
skilled in the art.

In addition to full-length viral proteins, polypeptides comprising truncated HCV amino acid sequences
encoding at least one viral epitope are useful immunological reagents. For example, polypeptides
comprising such truncated sequences can be used as reagents in an immunoassay. These polypeptides also are
candidate subunit antigens in compositions for antiserum production or vaccines. While these truncated
sequences can be produced by various known treatments of native viral protein, it is generally preferred to
make synthetic or recombinant polypeptides comprising an HCV sequence. Polypeptides comprising these
truncated HCV sequences can be made up entirely of HCV sequences (one or more epitopes, either
contiguous or noncontiguous), or HCV sequences and heterologous sequences in a fusion protein. Useful
heterologous sequences include sequences that provide for secretion from a recombinant host, enhance the
immunological reactivity of the HCV epitope(s), or facilitate the coupling of the polypeptide to an
immunoassay support or a vaccine carrier. See, e.g., EPO Pub. No. 116,201; U.S. Pat. No. 4,722,840; EPO
Pub. No. 259,149; U.S. Pat. No. 4,629,783, the disclosures of which are incorporated herein by reference.

The size of polypeptides comprising the truncated HCV sequences can vary widely, the minimum size
being a sequence of sufficient size to provide an HCV epitope, while the maximum size is not critical. For
convenience, the maximum size usually is not substantially greater than that required to provide the desired
HCV epitopes and function(s) of the heterologous sequence, if any. Typically, the truncated HCV amino acid
sequence will range from about 5 to about 100 amino acids in length. More typically, however, the HCV
sequence will be a maximum of about 50 amino acids in length, preferably a maximum of about 30 amino
acids. It is usually desirable to select HCV sequences of at least about 10, 12 or 15 amino acids, up to a
maximum of about 20 or 25 amino acids.

Truncated HCV amino acid sequences comprising epitopes can be identified in a number of ways. For
example, the entire viral protein sequence can be screened by preparing a series of short peptides that
together span the entire protein sequence. An example of antigenic screening of the regions of the HCV
polyprotein is shown infra. In addition, by starting with, for example, 100mer polypeptides, it would be
routine to test each polypeptide for the presence of epitope(s) showing a desired reactivity, and then testing
progressively smaller and overlapping fragments from an identified 100mer to map the epitope of interest.
Screening such peptides in an immunoassay is within the skill of the art. It is also known to carry out a
computer analysis of a protein sequence to identify potential epitopes, and then prepare oligopeptides
comprising the identified regions for screening. Such a computer analysis of the HCV amino acid sequence
is shown in Fig. 20, where the hydrophilic/hydrophobic character is displayed above the antigen index. The
amino acids are numbered from the starting MET (position 1) as shown in Fig. 17. It is appreciated by those
of skill in the art that such computer analysis of antigenicity does not always identify an epitope that
actually exists, and can also incorrectly identify a region of the protein as containing an epitope. Examples
of HCV amino acid sequences that may be useful, which are expressed from expression vectors comprised
of clones 5-1-1, 81, CA74a, 35f, 279a, C36, C33b, CA290a, C8f, C12f, 14c, 15e, C25c, C33c, C33f, 33g,
C39c, C40b, CA167b are described infra. Other examples of HCV amino acid sequences that may be useful
as described herein are set forth below. It is to be understood that these peptides do not necessarily
precisely map one epitope, and may also contain HCV sequence that is not immunogenic. These
non-immunogenic portions of the sequence can be defined as described above using conventional techniques
and deleted from the described sequences. Further, additional truncated HCV amino acid sequences that
comprise an epitope or are immunogenic can be identified as described above. The following sequences
are given by amino acid number (i.e., "AAn") where n is the amino acid number as shown in Fig. 17:
AA1-AA25; AA1-AA50; AA1-AA84; AA9-AA177; AA1-AA10; AA5-AA20; AA20-AA25; AA35-AA45;
AA50-AA100; AA40-AA90; AA45-AA65; AA65-AA75; AA80-90; AA99-AA120; AA95-AA110; AA105-AA120;
AA100-AA150; AA150-AA200; AA155-AA170; AA190-AA210; AA200-AA250; AA220-AA240; AA245-AA265;
AA250-AA300; AA290-AA330; AA290-305; AA300-AA350; AA310-AA330; AA350-AA400; AA380-AA395;
AA405-AA495; AA400-AA450; AA405-AA415; AA415-AA425; AA425-AA435; AA437-AA582; AA450-AA500;
AA440-AA460; AA460-AA470; AA475-AA495; AA500-AA550; AA511-AA690; AA515-AA550; AA550-AA600;
AA550-AA625; AA575-AA605; AA585-AA600; AA600-AA650; AA600-AA625; AA635-AA665; AA650-AA700;
AA645-AA680; AA700-AA750; AA700-AA725; AA700-AA750; AA725-AA775; AA770-AA790; AA750-AA800;
AA800-AA815; AA825-AA850; AA850-AA875; AA800-AA850; AA920-AA990; AA850-AA900; AA920-AA945;
AA940-AA965; AA970-AA990; AA950-AA1000; AA1000-AA1060; AA1000-AA1025; AA1000-AA1050;
AA1025-AA1040; AA1040-AA1055; AA1075-AA1175; AA1050-AA1200; AA1070-AA1100; AA1100-AA1130;
AA1140-AA1165; AA1192-AA1457; AA1195-AA1250; AA1200-AA1225; AA1225-AA1250; AA1250-AA1300;
AA1260-AA1310; AA1260-AA1280; AA1266-AA1428; AA1300-AA1350; AA1290-AA1310; AA1310-AA1340;
AA1345-AA1405; AA1345-AA1365; AA1350-AA1400; AA1365-AA1380; AA1380-AA1405; AA1400-AA1450;
AA1450-AA1500; AA1460-AA1475; AA1475-AA1515; AA1475-AA1500; AA1500-AA1550; AA1500-AA1515;
AA1515-AA1550; AA1550-AA1600; AA1545-AA1560; AA1569-AA1931; AA1570-AA1590; AA1595-AA1610;
AA1590-AA1650; AA1610-AA1645; AA1650-AA1690; AA1685-AA1770; AA1689-AA1805; AA1690-AA1720;
AA1694-AA1735; AA1720-AA1745; AA1745-AA1770; AA1750-AA1800; AA1775-AA1810; AA1795-AA1850;
AA1850-AA1900; AA1900-AA1950; AA1900-AA1920; AA1916-AA2021; AA1920-AA1940; AA1949-AA2124;
AA1950-AA2000; AA1950-AA1985; AA1980-AA2000; AA2000-AA2050; AA2005-AA2025; AA2020-AA2045;
AA2045-AA2100; AA2045-AA2070; AA2054-AA2223; AA2070-AA2100; AA2100-AA2150; AA2150-AA2200;
AA2200-AA2250; AA2200-AA2325; AA2250-AA2330; AA2255-AA2270; AA2265-AA2280; AA2280-AA2290;
AA2287-AA2385; AA2300-AA2350; AA2290-AA2310; AA2310-AA2330; AA2330-AA2350; AA2350-AA2400;
AA2348-AA2464; AA2345-AA2415; AA2345-AA2375; AA2370-AA2410; AA2371-AA2502; AA2400-AA2450;
AA2400-AA2425; AA2415-AA2450; AA2445-AA2500; AA2445-AA2475; AA2470-AA2490; AA2500-AA2550;
AA2505-AA2540; AA2535-AA2560; AA2550-AA2600; AA2560-AA2580; AA2600-AA2650; AA2605-AA2620;
AA2620-AA2650; AA2640-AA2660; AA2650-AA2700; AA2655-AA2670; AA2670-AA2700; AA2700-AA2750;
AA2740-AA2760; AA2750-AA2800; AA2755-AA2780; AA2780-AA2830; AA2785-AA2810; AA2796-AA2886;
AA2810-AA2825; AA2800-AA2850; AA2850-AA2900; AA2850-AA2865; AA2885-AA2905; AA2900-AA2950;
AA2910-AA2930; AA2925-AA2950; AA2945-end (C-terminal).

The above HCV amino acid sequences can be prepared as discrete peptides or incorporated into a
larger polypeptide, and may find use as described herein. Additional polypeptides comprising truncated
HCV sequences are described in the Examples.
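
As an illustration of how range labels of the form used above (for example "AA100-AA150" or "AA2945-end") map onto a polyprotein sequence, the following Python sketch parses a label and slices out the corresponding peptide. It is a minimal sketch under stated assumptions: the function names are hypothetical, residue numbering is taken to be 1-based and inclusive, and the toy sequence in the usage note is not HCV sequence.

```python
import re

# Accepts "AA100-AA150", "AA80-90" and "AA2945-end" style labels (assumed formats).
_RANGE = re.compile(r"^AA(\d+)-(?:AA)?(\d+|end)$")


def parse_range(label, length):
    """Return (start, end) as 1-based inclusive residue numbers."""
    match = _RANGE.match(label.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognised range label: {label!r}")
    start = int(match.group(1))
    end = length if match.group(2) == "end" else int(match.group(2))
    if not 1 <= start <= end <= length:
        raise ValueError(f"range {label!r} lies outside a sequence of length {length}")
    return start, end


def extract_peptide(polyprotein, label):
    """Slice the peptide named by a range label out of a one-letter sequence."""
    start, end = parse_range(label, len(polyprotein))
    return polyprotein[start - 1:end]
```

For example, with the toy sequence "ABCDEFGHIJ", the label "AA2-AA4" selects residues 2 through 4 and "AA8-end" selects residue 8 through the C-terminus.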

The observed relationship of the putative polyproteins of HCV and the Flaviviruses allows some
prediction of the putative domains of the HCV "non-structural" (NS) proteins. The locations of the individual
NS proteins in the putative Flavivirus precursor polyprotein are fairly well-known. Moreover, these also
coincide with observed gross fluctuations in the hydrophobicity profile of the polyprotein. It is established
that NS5 of Flaviviruses encodes the virion polymerase, and that NS1 corresponds with a complement
fixation antigen which has been shown to be an effective vaccine in animals. Recently, it has been shown
that a flaviviral protease function resides in NS3. Due to the observed similarities between HCV and the
Flaviviruses, described infra, deductions concerning the approximate locations of the corresponding protein
domains and functions in the HCV polyprotein are possible. The expression of polypeptides containing
these domains in a variety of recombinant host cells, including, for example, bacteria, yeast, insect, and
vertebrate cells, should give rise to important immunological reagents which can be used for diagnosis,
detection, and vaccines.

Although the non-structural protein regions of the putative polyproteins of the HCV isolate described
herein and of Flaviviruses appear to have some similarity, there is less similarity between the putative
structural regions which are towards the N-terminus. In this region, there is a greater divergence in
sequence, and in addition, the hydrophobic profiles of the two regions show less similarity. This
"divergence" begins in the N-terminal region of the putative NS1 domain in HCV, and extends to the
presumed N-terminus. Nevertheless, it may still be possible to predict the approximate locations of the
putative nucleocapsid (N-terminal basic domain) and E (generally hydrophobic) domains within the HCV
polyprotein. In the Examples the predictions are based on the changes observed in the hydrophobic profile
of the HCV polyprotein, and on a knowledge of the location and character of the flaviviral proteins. From
these predictions it may be possible to identify approximate regions of the HCV polyprotein that could
correspond with useful immunological reagents. For example, the E and NS1 proteins of Flaviviruses are
known to have efficacy as protective vaccines. These regions, as well as some which are shown to be
antigenic in the HCV isolate described herein, for example those within putative NS3, C, and NS5, etc.,
should also provide diagnostic reagents. Moreover, the location and expression of viral-encoded enzymes
may also allow the evaluation of anti-viral enzyme inhibitors, for example, inhibitors which prevent
enzyme activity by virtue of an interaction with the enzyme itself, or substances which may prevent
expression of the enzyme (for example, anti-sense RNA, or other drugs which interfere with expression).


Preparation of Hybrid Particle Immunogens Containing HCV Epitopes 

The immunogenicity of the epitopes of HCV may also be enhanced by preparing them in mammalian or
yeast systems fused with or assembled with particle-forming proteins such as, for example, that associated
with hepatitis B surface antigen. Constructs wherein the NANBV epitope is linked directly to the
particle-forming protein coding sequences produce hybrids which are immunogenic with respect to the HCV
epitope. In addition, all of the vectors prepared include epitopes specific to HBV, having various degrees of
immunogenicity, such as, for example, the pre-S peptide. Thus, particles constructed from particle-forming
protein which include HCV sequences are immunogenic with respect to HCV and HBV.

Hepatitis surface antigen (HBSAg) has been shown to be formed and assembled into particles in S.
cerevisiae (Valenzuela et al. (1982)), as well as in, for example, mammalian cells (Valenzuela, P., et al.
(1984)). The formation of such particles has been shown to enhance the immunogenicity of the monomer
subunit. The constructs may also include the immunodominant epitope of HBSAg, comprising the 55 amino
acids of the presurface (pre-S) region (Neurath et al. (1984)). Constructs of the pre-S-HBSAg particle
expressible in yeast are disclosed in EPO 174,444, published March 19, 1986; hybrids including heterologous
viral sequences for yeast expression are disclosed in EPO 175,261, published March 26, 1986. These
constructs may also be expressed in mammalian cells such as Chinese hamster ovary (CHO) cells using an
SV40-dihydrofolate reductase vector (Michelle et al. (1984)).

In addition, portions of the particle-forming protein coding sequence may be replaced with codons
encoding an HCV epitope. In this replacement, regions which are not required to mediate the aggregation of
the units to form immunogenic particles in yeast or mammals can be deleted, thus eliminating additional
HBV antigenic sites from competition with the HCV epitope.

EP 0 388 232 A1


Preparation of Vaccines 

Vaccines may be prepared from one or more immunogenic polypeptides derived from HCV cDNA,
including the cDNA sequences described in the Examples. The observed homology between HCV and
Flaviviruses provides information concerning the polypeptides which may be most effective as vaccines, as
well as the regions of the genome in which they are encoded. The general structure of the Flavivirus
genome is discussed in Rice et al. (1986). The flavivirus genomic RNA is believed to be the only
virus-specific mRNA species, and it is translated into the three viral structural proteins, i.e., C, M, and E, as well
as two large nonstructural proteins, NS4 and NS5, and a complex set of smaller nonstructural proteins. It is
known that major neutralizing epitopes for Flaviviruses reside in the E (envelope) protein (Roehrig (1986)).
Thus, vaccines may be comprised of recombinant polypeptides containing epitopes of HCV E. These
polypeptides may be expressed in bacteria, yeast, or mammalian cells, or alternatively may be isolated
from viral preparations. It is also anticipated that the other structural proteins may contain epitopes
which give rise to protective anti-HCV antibodies. Thus, polypeptides containing the epitopes of E, C, and M
may also be used, whether singly or in combination, in HCV vaccines.

In addition to the above, it has been shown that immunization with NS1 (nonstructural protein 1) results
in protection against yellow fever (Schlesinger et al. (1986)). This is true even though the immunization does
not give rise to neutralizing antibodies. Thus, particularly since this protein appears to be highly conserved
among Flaviviruses, it is likely that HCV NS1 will also be protective against HCV infection. Moreover, it also
shows that nonstructural proteins may provide protection against viral pathogenicity, even if they do not
cause the production of neutralizing antibodies.

The information provided in the Examples concerning the immunogenicity of the polypeptides
expressed from cloned HCV cDNAs which span the various regions of the HCV ORF also allows predictions
concerning their use in vaccines.

In view of the above, multivalent vaccines against HCV may be comprised of one or more epitopes 
from one or more structural proteins, and/or one or more epitopes from one or more nonstructural proteins. 
These vaccines may be comprised of, for example, recombinant HCV polypeptides and/or polypeptides
isolated from the virions. In particular, vaccines are contemplated comprising one or more of the following
HCV proteins, or subunit antigens derived therefrom: E, NS1, C, NS2, NS3, NS4 and NS5. Particularly
preferred are vaccines comprising E and/or NS1, or subunits thereof.

The preparation of vaccines which contain an immunogenic polypeptide(s) as active ingredients is
known to one skilled in the art. Typically, such vaccines are prepared as injectables, either as liquid
solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid prior to injection may
also be prepared. The preparation may also be emulsified, or the protein encapsulated in liposomes. The
active immunogenic ingredients are often mixed with excipients which are pharmaceutically acceptable and
compatible with the active ingredient. Suitable excipients are, for example, water, saline, dextrose, glycerol,
ethanol, or the like and combinations thereof. In addition, if desired, the vaccine may contain minor amounts
of auxiliary substances such as wetting or emulsifying agents, pH buffering agents, and/or adjuvants which
enhance the effectiveness of the vaccine. Examples of adjuvants which may be effective include but are not
limited to: aluminum hydroxide, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(CGP 19835A, referred to as MTP-PE), and RIBI, which contains three components extracted from bacteria,
monophosphoryl lipid A, trehalose dimycolate and cell wall skeleton (MPL + TDM + CWS) in a 2%
squalene/Tween 80 emulsion. The effectiveness of an adjuvant may be determined by measuring the
amount of antibodies directed against an immunogenic polypeptide containing an HCV antigenic sequence
resulting from administration of this polypeptide in vaccines which are also comprised of the various
adjuvants.

The vaccines are conventionally administered parenterally, by injection, for example, either
subcutaneously or intramuscularly. Additional formulations which are suitable for other modes of administration
include suppositories and, in some cases, oral formulations. For suppositories, traditional binders and
carriers may include, for example, polyalkylene glycols or triglycerides; such suppositories may be formed
from mixtures containing the active ingredient in the range of 0.5% to 10%, preferably 1%-2%. Oral
formulations include such normally employed excipients as, for example, pharmaceutical grades of
mannitol, lactose, starch, magnesium stearate, sodium saccharin, cellulose, magnesium carbonate, and the
like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release
formulations or powders and contain 10%-95% of active ingredient, preferably 25%-70%.

The proteins may be formulated into the vaccine as neutral or salt forms. Pharmaceutically acceptable
salts include the acid addition salts (formed with free amino groups of the peptide) which are formed
with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as
acetic, oxalic, tartaric, maleic, and the like. Salts formed with the free carboxyl groups may also be derived
from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides,
and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine, procaine, and
the like.



Dosage and Administration of Vaccines 

The vaccines are administered in a manner compatible with the dosage formulation, and in such
amount as will be prophylactically and/or therapeutically effective. The quantity to be administered, which is
generally in the range of 5 micrograms to 250 micrograms of antigen per dose, depends on the subject to
be treated, capacity of the subject's immune system to synthesize antibodies, and the degree of protection
desired. Precise amounts of active ingredient required to be administered may depend on the judgment of
the practitioner and may be peculiar to each subject.

The vaccine may be given in a single dose schedule, or preferably in a multiple dose schedule. A
multiple dose schedule is one in which a primary course of vaccination may be with 1-10 separate doses,
followed by other doses given at subsequent time intervals required to maintain and/or reinforce the
immune response, for example, at 1-4 months for a second dose, and if needed, a subsequent dose(s) after
several months. The dosage regimen will also, at least in part, be determined by the need of the individual
and be dependent upon the judgment of the practitioner.

In addition, the vaccine containing the immunogenic HCV antigen(s) may be administered in conjunction
with other immunoregulatory agents, for example, immune globulins.


Preparation of Antibodies Against HCV Epitopes




The immunogenic polypeptides prepared as described above are used to produce antibodies, both 
polyclonal and monoclonal. If polyclonal antibodies are desired, a selected mammal (e.g., mouse, rabbit, 
goat, horse, etc.) is immunized with an immunogenic polypeptide bearing an HCV epitope(s). Serum from 
the immunized animal is collected and treated according to known procedures. If serum containing 
polyclonal antibodies to an HCV epitope contains antibodies to other antigens, the polyclonal antibodies can 
be purified by immunoaffinity chromatography. Techniques for producing and processing polyclonal
antisera are known in the art; see, for example, Mayer and Walker (1987).

Alternatively, polyclonal antibodies may be isolated from a mammal which has been previously infected
with HCV. An example of a method for purifying antibodies to HCV epitopes from serum from an infected
individual, based upon affinity chromatography and utilizing a fusion polypeptide of SOD and a polypeptide
encoded within cDNA clone 5-1-1, is presented in EPO Pub. No. 318,216.

Monoclonal antibodies directed against HCV epitopes can also be readily produced by one skilled in
the art. The general methodology for making monoclonal antibodies by hybridomas is well known. Immortal
antibody-producing cell lines can be created by cell fusion, and also by other techniques such as direct
transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. See, e.g., M.
Schreier et al. (1980); Hammerling et al. (1981); Kennett et al. (1980); see also U.S. Patent Nos. 4,341,761;
4,399,121; 4,427,783; 4,444,887; 4,466,917; 4,472,500; 4,491,632. Monoclonal
antibodies produced against HCV epitopes can be screened for various properties, e.g., for isotype, epitope
affinity, etc.

Antibodies, both monoclonal and polyclonal, which are directed against HCV epitopes are particularly 
useful in diagnosis, and those which are neutralizing are useful in passive immunotherapy. Monoclonal 
antibodies, in particular, may be used to raise anti-idiotype antibodies. 

Anti-idiotype antibodies are immunoglobulins which carry an "internal image" of the antigen of the
infectious agent against which protection is desired. See, for example, Nisonoff, A., et al. (1981) and
Dreesman et al. (1985).

Techniques for raising anti-idiotype antibodies are known in the art. See, for example, Grzych (1985),
MacNamara et al. (1984), and Uytdehaag et al. (1985). These anti-idiotype antibodies may also be useful for
treatment and/or diagnosis of NANBH, as well as for an elucidation of the immunogenic regions of HCV
antigens.

It would also be recognized by one of ordinary skill in the art that a variety of types of antibodies
directed against HCV epitopes may be produced. As used herein, the term "antibody" refers to a
polypeptide or group of polypeptides which are comprised of at least one antibody combining site. An
"antibody combining site" or "binding domain" is formed from the folding of variable domains of an
antibody molecule(s) to form three-dimensional binding spaces with an internal surface shape and charge
distribution complementary to the features of an epitope of an antigen, which allows an immunological
reaction with the antigen. An antibody combining site may be formed from a heavy and/or a light chain
domain (VH and VL, respectively), which form hypervariable loops which contribute to antigen binding. The
term "antibody" includes, for example, vertebrate antibodies, hybrid antibodies, chimeric antibodies, altered
antibodies, univalent antibodies, the Fab proteins, and single domain antibodies.

A "single domain antibody" (dAb) is an antibody which is comprised of a VH domain, which reacts
immunologically with a designated antigen. A dAb does not contain a VL domain, but may contain other
antigen binding domains known to exist in antibodies, for example, the kappa and lambda domains.
Methods for preparing dAbs are known in the art. See, for example, Ward et al. (1989).

Antibodies may also be comprised of VH and VL domains, as well as other known antigen binding
domains. Examples of these types of antibodies and methods for their preparation are known in the art (see,
e.g., U.S. Patent No. 4,816,467, which is incorporated herein by reference), and include the following. For
example, "vertebrate antibodies" refers to antibodies which are tetramers or aggregates thereof, comprising
light and heavy chains which are usually aggregated in a "Y" configuration and which may or may not have
covalent linkages between the chains. In vertebrate antibodies, the amino acid sequences of all the chains
of a particular antibody are homologous with the chains found in one antibody produced by the lymphocyte
which produces that antibody in situ, or in vitro (for example, in hybridomas). Vertebrate antibodies
typically include native antibodies, for example, purified polyclonal antibodies and monoclonal antibodies.
Examples of the methods for the preparation of these antibodies are described infra.

"Hybrid antibodies" are antibodies wherein one pair of heavy and light chains is homologous to those in 
a first antibody, while the other pair of heavy and light chains is homologous to those in a different second 
antibody. Typically, each of these two pairs will bind different epitopes, particularly on different antigens. 
This results in the property of "divalence", i.e., the ability to bind two antigens simultaneously. Such hybrids
may also be formed using chimeric chains, as set forth below. 

"Chimeric antibodies", are antibodies in which the heavy and/or light chains are fusion proteins. 
Typically the constant domain of the chains is from one particular species and/or class, and the variable 
domains are from a different species and/or class. Also included is any antibody in which either or both of 
the heavy or light chains are composed of combinations of sequences mimicking the sequences in
antibodies of different sources, whether these sources be differing classes, or different species of origin,
and whether or not the fusion point is at the variable/constant boundary. Thus, it is possible to produce
antibodies in which neither the constant nor the variable region mimics known antibody sequences. It then
becomes possible, for example, to construct antibodies whose variable region has a higher specific affinity
for a particular antigen, or whose constant region can elicit enhanced complement fixation, or to make other
improvements in properties possessed by a particular constant region. 

Another example is "altered antibodies", which refers to antibodies in which the naturally occurring
amino acid sequence in a vertebrate antibody has been varied. Utilizing recombinant DNA techniques,
antibodies can be redesigned to obtain desired characteristics. The possible variations are many, and range
from the changing of one or more amino acids to the complete redesign of a region, for example, the
constant region. Changes in the constant region are, in general, made to attain desired cellular process
characteristics, e.g., changes in complement fixation, interaction with membranes, and other effector functions.
Changes in the variable region may be made to alter antigen binding characteristics. The antibody may also
be engineered to aid the specific delivery of a molecule or substance to a specific cell or tissue site. The
desired alterations may be made by known techniques in molecular biology, e.g., recombinant techniques,
site directed mutagenesis, etc.

Yet another example is "univalent antibodies", which are aggregates comprised of a heavy chain/light
chain dimer bound to the Fc (i.e., constant) region of a second heavy chain. This type of antibody escapes 
antigenic modulation. See, e.g., Glennie et al. (1982). 

Included also within the definition of antibodies are "Fab" fragments of antibodies. The "Fab" region
refers to those portions of the heavy and light chains which are roughly equivalent, or analogous, to the
sequences which comprise the branch portion of the heavy and light chains, and which have been shown to
exhibit immunological binding to a specified antigen, but which lack the effector Fc portion. "Fab" includes
aggregates of one heavy and one light chain (commonly known as Fab'), as well as tetramers containing
the 2H and 2L chains (referred to as F(ab)2), which are capable of selectively reacting with a designated
antigen or antigen family. "Fab" antibodies may be divided into subsets analogous to those described
above, i.e., "vertebrate Fab", "hybrid Fab", "chimeric Fab", and "altered Fab". Methods of producing "Fab"
fragments of antibodies are known within the art and include, for example, proteolysis, and synthesis by
recombinant techniques.


II. H. Diagnostic Oligonucleotide Probes and Kits 


Using the disclosed portions of the isolated HCV cDNAs as a basis, oligomers of approximately 8
nucleotides or more can be prepared, either by excision or synthetically, which hybridize with the HCV
genome and are useful in identification of the viral agent(s), further characterization of the viral genome(s),
as well as in detection of the virus(es) in diseased individuals. The probes for HCV polynucleotides (natural
or derived) are a length which allows the detection of unique viral sequences by hybridization. While 6-8
nucleotides may be a workable length, sequences of 10-12 nucleotides are preferred, and about 20
nucleotides appears optimal. Preferably, these sequences will derive from regions which lack heterogeneity.
These probes can be prepared using routine methods, including automated oligonucleotide synthetic
methods. Among useful probes, for example, are those derived from the newly isolated clones disclosed
herein, as well as the various oligomers useful in probing cDNA libraries, set forth below. A complement to
any unique portion of the HCV genome will be satisfactory. For use as probes, complete complementarity is
desirable, though it may be unnecessary as the length of the fragment is increased.
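
The idea of taking a complement to a stretch of the target sequence can be sketched in a few lines of Python. This is only an illustration: the function names, the 20-nucleotide default (chosen because the text calls about 20 nucleotides optimal) and the toy input are assumptions, and the sketch does not test whether a window is unique or free of heterogeneity.

```python
# Translation table for Watson-Crick complements of DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def candidate_probes(target, length=20):
    """Yield (start, probe) pairs for every window of the given length.

    start is the 1-based position of the window in the target, and probe is
    the fully complementary oligomer for that window, written 5' to 3'.
    """
    if length < 1:
        raise ValueError("probe length must be at least 1")
    for i in range(len(target) - length + 1):
        yield i + 1, reverse_complement(target[i:i + length])
```

In practice, candidates produced this way would still have to be screened for uniqueness and for location in conserved regions, as the paragraph above notes.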

For use of such probes as diagnostics, the biological sample to be analyzed, such as blood or serum,
may be treated, if desired, to extract the nucleic acids contained therein. The resulting nucleic acid from the
sample may be subjected to gel electrophoresis or other size separation techniques; alternatively, the
nucleic acid sample may be dot blotted without size separation. The probes are then labeled. Suitable
labels, and methods for labeling probes are known in the art, and include, for example, radioactive labels
incorporated by nick translation or kinasing, biotin, fluorescent probes, and chemiluminescent probes. The
nucleic acids extracted from the sample are then treated with the labeled probe under hybridization
conditions of suitable stringencies, and polynucleotide duplexes containing the probe are detected.

The probes can be made completely complementary to the HCV genome. Therefore, usually high 
stringency conditions are desirable in order to prevent false positives. However, conditions of high 
stringency should only be used if the probes are complementary to regions of the viral genome which lack 
heterogeneity. The stringency of hybridization is determined by a number of factors during hybridization 
and during the washing procedure, including temperature, ionic strength, length of time, and concentration
of formamide. These factors are outlined in, for example, Maniatis, T. (1982). 
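
To make the dependence on these factors concrete, the sketch below uses a widely quoted empirical approximation for the melting temperature of long DNA-DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.61(% formamide) - 500/L. This formula is not taken from the present disclosure; it is introduced here only as an assumption for illustration, and it is known to be unreliable for short oligomers of the 20-nucleotide class. The function name and parameters are hypothetical.

```python
import math


def melting_temp_c(gc_percent, length_nt, na_molar, formamide_percent=0.0):
    """Approximate Tm (degrees C) of a long DNA-DNA hybrid.

    Empirical relation assumed for illustration only:
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    if length_nt <= 0 or na_molar <= 0:
        raise ValueError("length and sodium concentration must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)   # lower ionic strength lowers Tm
            + 0.41 * gc_percent             # GC-rich duplexes are more stable
            - 0.61 * formamide_percent      # formamide destabilises duplexes
            - 500.0 / length_nt)            # shorter duplexes melt sooner
```

Under this approximation, raising the formamide concentration or lowering the salt concentration lowers Tm, so a fixed hybridization or wash temperature becomes more stringent.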

Generally, it is expected that the HCV genome sequences will be present in serum of infected
individuals at relatively low levels, i.e., at approximately 10²-10³ chimp infectious doses (CID) per ml. This
level may require that amplification techniques be used in hybridization assays. Such techniques are known
in the art. For example, the Enzo Biochemical Corporation "Bio-Bridge" system uses terminal
deoxynucleotide transferase to add unmodified 3'-poly-dT-tails to a DNA probe. The poly dT-tailed probe is
hybridized to the target nucleotide sequence, and then to a biotin-modified poly-A. PCT application
84/03520 and EPA 124221 describe a DNA hybridization assay in which: (1) analyte is annealed to a
single-stranded DNA probe that is complementary to an enzyme-labeled oligonucleotide; and (2) the resulting
tailed duplex is hybridized to an enzyme-labeled oligonucleotide. EPA 204510 describes a DNA
hybridization assay in which analyte DNA is contacted with a probe that has a tail, such as a poly-dT tail, an
amplifier strand that has a sequence that hybridizes to the tail of the probe, such as a poly-A sequence, and
which is capable of binding a plurality of labeled strands. A particularly desirable technique may first involve
amplification of the target HCV sequences in sera approximately 10,000 fold, i.e., to approximately 10⁶
sequences/ml. This may be accomplished, for example, by the polymerase chain reaction (PCR) technique
which is described by Saiki et al. (1986), by Mullis, U.S. Patent No. 4,683,195, and by Mullis et al., U.S.
Patent No. 4,683,202. The amplified sequence(s) may then be detected using a hybridization assay which is
described in EP 317,077, published May 24, 1989. These hybridization assays, which should detect
sequences at the level of 10⁶/ml, utilize nucleic acid multimers which bind to single-stranded analyte
nucleic acid, and which also bind to a multiplicity of single-stranded labeled oligonucleotides. A suitable
solution phase sandwich assay which may be used with labeled polynucleotide probes, and the methods for
the preparation of probes, is described in EPO 225,807, published June 16, 1987.
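
The amplification arithmetic in the preceding paragraph can be checked with a short Python sketch. The starting level of 100 target sequences per ml is an assumption chosen so that a 10,000-fold amplification gives the stated level of about one million sequences per ml; the function names and the idealised per-cycle efficiency model are likewise illustrative and not taken from the cited references.

```python
import math


def amplified_concentration(start_per_ml, fold):
    """Target concentration after amplification by the given fold."""
    return start_per_ml * fold


def pcr_cycles_needed(fold, efficiency=1.0):
    """Smallest whole number of cycles giving at least `fold` amplification.

    Assumes each cycle multiplies the target by (1 + efficiency), where an
    efficiency of 1.0 means perfect doubling (an idealisation).
    """
    if fold < 1 or not 0 < efficiency <= 1:
        raise ValueError("fold must be >= 1 and efficiency in (0, 1]")
    return math.ceil(math.log(fold) / math.log(1 + efficiency))
```

With perfect doubling, 13 cycles give 8,192-fold and 14 cycles give 16,384-fold amplification, so about 14 cycles would be the minimum for a 10,000-fold increase; real reactions are less efficient and need more.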

The probes can be packaged into diagnostic kits. Diagnostic kits include the probe DNA, which may be
labeled; alternatively, the probe DNA may be unlabeled and the ingredients for labeling may be included in
the kit in separate containers. The kit may also contain other suitably packaged reagents and materials
needed for the particular hybridization protocol, for example, standards, as well as instructions for
conducting the test.


Immunoassay and Diagnostic Kits 


Both the polypeptides which react immunologically with serum containing HCV antibodies, for example,
those detected by the antigenic screening method described infra in the Examples, as well as those derived
from or encoded within the isolated clones described in the Examples, and composites thereof, and the
antibodies raised against the HCV-specific epitopes in these polypeptides, are useful in immunoassays to
detect the presence of HCV antibodies, or the presence of the virus and/or viral antigens, in biological samples.
Design of the immunoassays is subject to a great deal of variation, and a variety of these are known in the
art. For example, the immunoassay may utilize one viral epitope; alternatively, the immunoassay may use a
combination of viral epitopes derived from these sources; these epitopes may be derived from the same or
from different viral polypeptides, and may be in separate recombinant or natural polypeptides, or together in
the same recombinant polypeptides. It may use, for example, a monoclonal antibody directed towards a
viral epitope(s), a combination of monoclonal antibodies directed towards epitopes of one viral antigen,
monoclonal antibodies directed towards epitopes of different viral antigens, polyclonal antibodies directed
towards the same viral antigen, or polyclonal antibodies directed towards different viral antigens. Protocols
may be based, for example, upon competition, or direct reaction, or sandwich type assays. Protocols may
also, for example, use solid supports, or may be by immunoprecipitation. Most assays involve the use of
labeled antibody or polypeptide; the labels may be, for example, fluorescent, chemiluminescent, radioactive,
or dye molecules. Assays which amplify the signals from the probe are also known; examples are
assays which utilize biotin and avidin, and enzyme-labeled and mediated immunoassays, such as ELISA
assays.

Some of the antigenic regions of the putative polyprotein have been mapped and identified by
screening the antigenicity of bacterial expression products of HCV cDNAs which encode portions of the
polyprotein. See the Examples. Other antigenic regions of HCV may be detected by expressing the portions
of the HCV cDNAs in other expression systems, including yeast systems and cellular systems derived from
insects and vertebrates. In addition, studies yielding an antigenicity index and a
hydrophobicity/hydrophilicity profile provide information concerning the probability of a region's
antigenicity.
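
A hydrophobicity profile of the kind referred to above is commonly computed as a sliding-window average of per-residue hydropathy values. The sketch below uses the Kyte-Doolittle scale; the source does not state which scale or window size was used, so both are assumptions made for illustration, as is the function name.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of every window along a protein sequence.

    Positive values indicate hydrophobic stretches and negative values
    hydrophilic stretches. Raises KeyError for non-standard residues.
    """
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]
```

Strongly hydrophilic windows are often taken as candidate surface-exposed, and hence potentially antigenic, regions, although such predictions are only probabilistic.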

The studies on antigenic mapping by expression of HCV cDNAs showed that a number of clones
containing these cDNAs expressed polypeptides which were immunologically reactive with serum from
individuals with NANBH. No single polypeptide was immunologically reactive with all sera. Five of these
polypeptides were very immunogenic in that antibodies to the HCV epitopes in these polypeptides were
detected in many different patient sera, although the overlap in detection was not complete. Thus, the
results on the immunogenicity of the polypeptides encoded in the various clones suggest that efficient
detection systems may include the use of panels of epitopes. The epitopes in the panel may be
constructed into one or multiple polypeptides.
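
One way to think about assembling such a panel is as a coverage problem: pick a small set of antigens that together react with as many sera as possible. The following Python sketch applies a simple greedy selection. The reactivity table in the usage note is entirely hypothetical (it is not data from the Examples), and the greedy heuristic is an illustrative choice rather than a method described in this document.

```python
def greedy_panel(reactivity, max_size=None):
    """Greedily choose antigens that together detect the most sera.

    reactivity maps an antigen name to the set of serum IDs it reacts with.
    Returns (panel, undetected), where panel is the ordered list of chosen
    antigens and undetected is the set of sera that no chosen antigen detects.
    """
    remaining = set().union(*reactivity.values())
    panel = []
    while remaining and (max_size is None or len(panel) < max_size):
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(reactivity),
                   key=lambda name: len(reactivity[name] & remaining))
        gained = reactivity[best] & remaining
        if not gained:
            break
        panel.append(best)
        remaining -= gained
    return panel, remaining
```

With a hypothetical table such as `{"antigen_A": {1, 2, 3, 4}, "antigen_B": {3, 4, 5}, "antigen_C": {5, 6}, "antigen_D": {2, 6}}`, no single antigen detects all six sera, but the two-member panel of antigen_A and antigen_C does.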

Kits suitable for immunodiagnosis and containing the appropriate labeled reagents are constructed by 
packaging the appropriate materials, including the polypeptides of the invention containing HCV epitopes or 
antibodies directed against HCV epitopes in suitable containers, along with the remaining reagents and
materials required for the conduct of the assay, as well as a suitable set of assay instructions. 

Further Characterization of the HCV Genome, Virions, and Viral Antigens Using Probes Derived From cDNA
to the Viral Genome


The HCV cDNA sequence information in the newly isolated clones described in the Examples may be
used to gain further information on the sequence of the HCV genome, and for identification and isolation of
the HCV agent, and thus will aid in its characterization including the nature of the genome, the structure of
the viral particle, and the nature of the antigens of which it is composed. This information, in turn, can lead
to additional polynucleotide probes, polypeptides derived from the HCV genome, and antibodies directed
against HCV epitopes which would be useful for the diagnosis and/or treatment of HCV caused NANBH.

The cDNA sequence information in the abovementioned clones is useful for the design of probes for the
isolation of additional cDNA sequences which are derived from as yet undefined regions of the HCV
genome(s) from which the cDNAs in clones described herein and in EP 0,316,218 are derived. For example,
labeled probes containing a sequence of approximately 8 or more nucleotides, and preferably 20 or more
nucleotides, which are derived from regions close to the 5'-termini or 3'-termini of the composite HCV
cDNA sequence shown in Fig. 17 may be used to isolate overlapping cDNA sequences from HCV cDNA
libraries. Alternatively, characterization of the genomic segments could be from the viral genome(s) isolated
from purified HCV particles. Methods for purifying HCV particles and for detecting them during the
purification procedure are described herein, infra. Procedures for isolating polynucleotide genomes from
viral particles are known in the art, and one procedure which may be used is that described in EP
0,218,316. The isolated genomic segments could then be cloned and sequenced. An example of this
technique, which utilizes amplification of the sequences to be cloned, is provided infra, and yielded clone
16jh.

Methods for constructing cDNA libraries are known in the art, and are discussed supra and infra; a
method for the construction of HCV cDNA libraries in lambda-gt11 is discussed in EPO Pub. No. 318,216.
However, cDNA libraries which are useful for screening with nucleic acid probes may also be constructed in
other vectors known in the art, for example, lambda-gt10 (Huynh et al. (1985)).


Screening for Anti-Viral Agents for HCV 


The availability of cell culture and animal model systems for HCV makes it possible to screen for
anti-viral agents which inhibit HCV replication, and particularly for those agents which preferentially allow cell
growth and multiplication while inhibiting viral replication. These screening methods are known by those of
skill in the art. Generally, the anti-viral agents are tested at a variety of concentrations, for their effect on
preventing viral replication in cell culture systems which support viral replication, and then for an inhibition
of infectivity or of viral pathogenicity (and a low level of toxicity) in an animal model system.

The methods and compositions provided herein for detecting HCV antigens and HCV polynucleotides
are useful for screening of anti-viral agents in that they provide an alternative, and perhaps more sensitive,
means for detecting the agent's effect on viral replication than the cell plaque assay or ID50 assay. For
example, the HCV-polynucleotide probes described herein may be used to quantitate the amount of viral
nucleic acid produced in a cell culture. This could be accomplished, for example, by hybridization or
competition hybridization of the infected cell nucleic acids with a labeled HCV-polynucleotide probe.
Similarly, anti-HCV antibodies may be used to identify and quantitate HCV antigen(s) in the cell culture
utilizing the immunoassays described herein. In addition, since it may be desirable to quantitate HCV
antigens in the infected cell culture by a competition assay, the polypeptides encoded within the HCV
cDNAs described herein are useful in these competition assays. Generally, a recombinant HCV polypeptide
derived from the HCV cDNA would be labeled, and the inhibition of binding of this labeled polypeptide to an
HCV polypeptide due to the antigen produced in the cell culture system would be monitored. Moreover,
these techniques are particularly useful in cases where the HCV may be able to replicate in a cell line
without causing cell death.
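
As a minimal sketch of how the readout of such a competition assay might be expressed, the following computes the percent inhibition of binding of the labeled polypeptide. The function name and the counts are hypothetical and are given for illustration only; greater inhibition is taken to indicate more competing antigen in the culture.

```python
def percent_inhibition(bound_without_competitor, bound_with_competitor):
    """Percent reduction in bound labeled polypeptide caused by the competitor."""
    if bound_without_competitor <= 0:
        raise ValueError("reference signal must be positive")
    return 100.0 * (1.0 - bound_with_competitor / bound_without_competitor)


# Hypothetical counts (cpm) of bound labeled polypeptide.
untreated_culture = percent_inhibition(10000, 2500)  # much competing antigen
treated_culture = percent_inhibition(10000, 9000)    # little competing antigen
```

Under these assumed counts, a culture treated with an effective anti-viral agent would show the lower inhibition value, since less HCV antigen is available to compete.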

The anti-viral agents which may be tested for efficacy by these methods are known in the art, and
include, for example, those which interact with virion components and/or cellular components which are
necessary for the binding and/or replication of the virus. Typical anti-viral agents may include, for example,
inhibitors of virion polymerase and/or protease(s) necessary for cleavage of the precursor polypeptides.
Other anti-viral agents may include those which act with nucleic acids to prevent viral replication, for
example, anti-sense polynucleotides, etc.

Antisense polynucleotide molecules are comprised of a complementary nucleotide sequence which
allows them to hybridize specifically to designated regions of genomes or RNAs. Antisense polynucleotides
may include, for example, molecules that will block protein translation by binding to mRNA, or may be
molecules which prevent replication of viral RNA by transcriptase. They may also include molecules which
carry agents (non-covalently attached or covalently bound) which cause the viral RNA to be inactive by
causing, for example, scissions in the viral RNA. They may also bind to cellular polynucleotides which
enhance and/or are required for viral infectivity, replicative ability, or chronicity. Antisense molecules which
are to hybridize to HCV derived RNAs may be designed based upon the sequence information of the HCV
cDNAs provided herein. The antiviral agents based upon anti-sense polynucleotides for HCV may be
designed to bind with high specificity, to be of increased solubility, to be stable, and to have low toxicity.
Hence, they may be delivered in specialized systems, for example, liposomes, or by gene therapy. In
addition, they may include analogs, attached proteins, substituted or altered bonding between bases, etc.
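
The complementarity on which antisense design rests may be sketched as follows. The target shown is an arbitrary illustrative sequence, not a sequence taken from the HCV cDNAs, and the function name is hypothetical. The sketch simply returns the DNA oligonucleotide, written 5' to 3', that is complementary to a given RNA target.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def antisense_dna(target_rna):
    """Return the DNA oligonucleotide (5'->3') complementary to an RNA target (5'->3')."""
    target = target_rna.upper()
    # Reverse the target and complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(target))


# Arbitrary illustrative target containing a start codon.
target = "AUGGCUUACGGAUCC"
oligo = antisense_dna(target)
```

A practical design would additionally weigh specificity, stability and solubility, as noted above; none of those considerations is modeled here.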
Other types of drugs may be based upon polynucleotides which "mimic" important control regions of
the HCV genome, and which may be therapeutic due to their interactions with key components of the
system responsible for viral infectivity or replication.


General Methods 


The general techniques used in extracting the genome from a virus, preparing and probing a cDNA
library, sequencing clones, constructing expression vectors, transforming cells, performing immunological
assays such as radioimmunoassays and ELISA assays, growing cells in culture, and the like are known
in the art and laboratory manuals are available describing these techniques. However, as a general guide,
the following sets forth some sources currently available for such procedures, and for materials useful in
carrying them out.

Both prokaryotic and eukaryotic host cells may be used for expression of desired coding sequences
when appropriate control sequences which are compatible with the designated host are used. Among
prokaryotic hosts, E. coli is most frequently used. Expression control sequences for prokaryotes include
promoters, optionally containing operator portions, and ribosome binding sites. Transfer vectors compatible
with prokaryotic hosts are commonly derived from, for example, pBR322, a plasmid containing operons
conferring ampicillin and tetracycline resistance, and the various pUC vectors, which also contain
sequences conferring antibiotic resistance markers. These markers may be used to obtain successful
transformants by selection. Commonly used prokaryotic control sequences include the Beta-lactamase
(penicillinase) and lactose promoter systems (Chang et al. (1977)), the tryptophan (trp) promoter system
(Goeddel et al. (1980)) and the lambda-derived PL promoter and N gene ribosome binding site (Shimatake
et al. (1981)) and the hybrid tac promoter (De Boer et al. (1983)) derived from sequences of the trp and lac
UV5 promoters. The foregoing systems are particularly compatible with E. coli; if desired, other prokaryotic
hosts such as strains of Bacillus or Pseudomonas may be used, with corresponding control sequences.

Eukaryotic hosts include yeast and mammalian cells in culture systems. Saccharomyces cerevisiae and
Saccharomyces carlsbergensis are the most commonly used yeast hosts, and are convenient fungal hosts.
Yeast compatible vectors carry markers which permit selection of successful transformants by conferring
prototrophy to auxotrophic mutants or resistance to heavy metals on wild-type strains. Yeast compatible
vectors may employ the 2 micron origin of replication (Broach et al. (1983)), the combination of CEN3 and
ARS1 or other means for assuring replication, such as sequences which will result in incorporation of an
appropriate fragment into the host cell genome. Control sequences for yeast vectors are known in the art
and include promoters for the synthesis of glycolytic enzymes (Hess et al. (1968); Holland et al. (1978)),
including the promoter for 3-phosphoglycerate kinase (Hitzeman (1980)). Terminators may also be included,
such as those derived from the enolase gene (Holland (1981)). Particularly useful control systems are those
which comprise the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) promoter or alcohol
dehydrogenase (ADH) regulatable promoter, terminators also derived from GAPDH, and if secretion is desired,
leader sequence from yeast alpha factor. In addition, the transcriptional regulatory region and the
transcriptional initiation region which are operably linked may be such that they are not naturally associated in the
wild-type organism. These systems are described in detail in EPO 120,551, published October 3, 1984;
EPO 116,201, published August 22, 1984; and EPO 164,556, published December 18, 1985, all of which are
assigned to the herein assignee, and are hereby incorporated herein by reference.

Mammalian cell lines available as hosts for expression are known in the art and include many
immortalized cell lines available from the American Type Culture Collection (ATCC), including HeLa cells,
Chinese hamster ovary (CHO) cells, baby hamster kidney (BHK) cells, and a number of other cell lines.
Suitable promoters for mammalian cells are also known in the art and include viral promoters such as that
from Simian Virus 40 (SV40) (Fiers (1978)), Rous sarcoma virus (RSV), adenovirus (ADV), and bovine
papilloma virus (BPV). Mammalian cells may also require terminator sequences and poly A addition
sequences; enhancer sequences which increase expression may also be included, and sequences which
cause amplification of the gene may also be desirable. These sequences are known in the art. Vectors
suitable for replication in mammalian cells may include viral replicons, or sequences which ensure
integration of the appropriate sequences encoding NANBV epitopes into the host genome.

Transformation may be by any known method for introducing polynucleotides into a host cell, including,
for example, packaging the polynucleotide in a virus and transducing a host cell with the virus, and by direct
uptake of the polynucleotide. The transformation procedure used depends upon the host to be transformed.
For example, transformation of the E. coli host cells with lambda-gt11 containing BB-NANBV sequences is
discussed in the Example section, infra. Bacterial transformation by direct uptake generally employs
treatment with calcium or rubidium chloride (Cohen (1972); Maniatis (1982)). Yeast transformation by direct
uptake may be carried out using the method of Hinnen et al. (1978). Mammalian transformations by direct
uptake may be conducted using the calcium phosphate precipitation method of Graham and Van der Eb
(1978), or the various known modifications thereof.

Vector construction employs techniques which are known in the art. Site-specific DNA cleavage is
performed by treating with suitable restriction enzymes under conditions which generally are specified by
the manufacturer of these commercially available enzymes. In general, about 1 microgram of plasmid or
DNA sequence is cleaved by 1 unit of enzyme in about 20 microliters buffer solution by incubation of 1-2 hr
at 37°C. After incubation with the restriction enzyme, protein is removed by phenol/chloroform extraction
and the DNA recovered by precipitation with ethanol. The cleaved fragments may be separated using
polyacrylamide or agarose gel electrophoresis techniques, according to the general procedures found in
Methods in Enzymology (1980) 65:499-560.
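
For orientation only, the fragment sizes expected on a gel after cleavage of a linear DNA may be predicted with a sketch such as the following. The 40-bp sequence is hypothetical, the function name is an assumption, and only the top strand cut position of EcoRI (G^AATTC) is modeled.

```python
def digest(sequence, site, cut_offset):
    """Cut a linear DNA sequence at every occurrence of site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (top strand). Returns the list of fragment lengths.
    """
    sequence, site = sequence.upper(), site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    edges = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(edges, edges[1:])]


# Hypothetical 40-bp fragment with two EcoRI sites (G^AATTC).
dna = "ATGCGAATTCGGCTAGCTAGGATCCGAATTCTTGACCAGT"
fragments = digest(dna, "GAATTC", 1)
```

The fragment lengths always sum to the length of the starting molecule, which offers a simple check on a digest pattern.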

Sticky ended cleavage fragments may be blunt ended using E. coli DNA polymerase I (Klenow) in the
presence of the appropriate deoxynucleotide triphosphates (dNTPs). Treatment with
S1 nuclease may also be used, resulting in the hydrolysis of any single stranded DNA portions.

Ligations are carried out using standard buffer and temperature conditions using T4 DNA ligase and
ATP; sticky end ligations require less ATP and less ligase than blunt end ligations. When vector fragments
are used as part of a ligation mixture, the vector fragment is often treated with bacterial alkaline
phosphatase (BAP) or calf intestinal alkaline phosphatase to remove the 5'-phosphate and thus prevent
religation of the vector; alternatively, restriction enzyme digestion of unwanted fragments can be used to
prevent ligation.

Ligation mixtures are transformed into suitable cloning hosts, such as E. coli, and successful
transformants selected by, for example, antibiotic resistance, and screened for the correct construction.

Synthetic oligonucleotides may be prepared using an automated oligonucleotide synthesizer as
described by Warner (1984). If desired, the synthetic strands may be labeled with 32P by treatment with
polynucleotide kinase in the presence of 32P-ATP, using standard conditions for the reaction.

DNA sequences, including those isolated from cDNA libraries, may be modified by known techniques,
including, for example, site directed mutagenesis, as described by Zoller (1982). Briefly, the DNA to be
modified is packaged into phage as a single stranded sequence, and converted to a double stranded DNA
with DNA polymerase using, as a primer, a synthetic oligonucleotide complementary to the portion of the
DNA to be modified, and having the desired modification included in its own sequence. The resulting
double stranded DNA is transformed into a phage supporting host bacterium. Cultures of the transformed
bacteria, which contain replications of each strand of the phage, are plated in agar to obtain plaques.
Theoretically, 50% of the new plaques contain phage having the mutated sequence, and the remaining 50%
have the original sequence. Replicates of the plaques are hybridized to labeled synthetic probe at
temperatures and conditions which permit hybridization with the correct strand, but not with the unmodified
sequence. The sequences which have been identified by hybridization are recovered and cloned.

DNA libraries may be probed using the procedure of Grunstein and Hogness (1975). Briefly, in this
procedure, the DNA to be probed is immobilized on nitrocellulose filters, denatured, and prehybridized with
a buffer containing 0-50% formamide, 0.75 M NaCl, 75 mM Na citrate, 0.02% (wt/v) each of bovine serum
albumin, polyvinyl pyrrolidone, and Ficoll, 50 mM Na phosphate (pH 6.5), 0.1% SDS, and 100
micrograms/ml carrier denatured DNA. The percentage of formamide in the buffer, as well as the time and
temperature conditions of the prehybridization and subsequent hybridization steps, depend on the
stringency required. Oligomeric probes which require lower stringency conditions are generally used with low
percentages of formamide, lower temperatures, and longer hybridization times. Probes containing more than
30 or 40 nucleotides, such as those derived from cDNA or genomic sequences, generally employ higher
temperatures, e.g., about 40-42°C, and a high percentage, e.g., 50%, formamide. Following prehybridization,
5'-32P-labeled oligonucleotide probe is added to the buffer, and the filters are incubated in this mixture
under hybridization conditions. After washing, the treated filters are subjected to autoradiography to show
the location of the hybridized probe; DNA in corresponding locations on the original agar plates is used as
the source of the desired DNA.
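
The dependence of stringency on formamide and salt may be illustrated with a widely used empirical approximation for the melting temperature of long DNA:DNA hybrids. The formula and its coefficients are a textbook rule of thumb and are an assumption of this sketch, not part of the procedure above; the GC content and probe length are hypothetical, and the sodium contributions of the citrate and phosphate are ignored for simplicity.

```python
import math


def estimated_tm(gc_percent, length, sodium_molar, formamide_percent):
    """Empirical melting temperature (degrees C) of a long DNA:DNA hybrid.

    Assumed approximation:
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/length
    """
    return (81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent
            - 0.61 * formamide_percent - 500.0 / length)


# Hypothetical 500-nucleotide probe, 50% GC, in 0.75 M NaCl.
tm_no_formamide = estimated_tm(50, 500, 0.75, 0)
tm_50_formamide = estimated_tm(50, 500, 0.75, 50)
```

Under this approximation each percent of formamide lowers the estimated melting temperature by a fixed amount, which is consistent with the use of 50% formamide to allow stringent hybridization of long probes at moderate temperatures.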

For routine vector constructions, ligation mixtures are transformed into E. coli strain HB101 or other
suitable host, and successful transformants selected by antibiotic resistance or other markers. Plasmids
from the transformants are then prepared according to the method of Clewell et al. (1969), usually following
chloramphenicol amplification (Clewell (1972)). The DNA is isolated and analyzed, usually by restriction
enzyme analysis and/or sequencing. Sequencing may be by the dideoxy method of Sanger et al. (1977) as
further described by Messing et al. (1981), or by the method of Maxam et al. (1980). Problems with band
compression, which are sometimes observed in GC rich regions, were overcome by use of
7-deazoguanosine according to Barr et al. (1986).

The enzyme-linked immunosorbent assay (ELISA) can be used to measure either antigen or antibody
concentrations. This method depends upon conjugation of an enzyme to either an antigen or an antibody,
and uses the bound enzyme activity as a quantitative label. To measure antibody, the known antigen is
fixed to a solid phase (e.g., a microplate or plastic cup), incubated with test serum dilutions, washed,
incubated with anti-immunoglobulin labeled with an enzyme, and washed again. Enzymes suitable for
labeling are known in the art, and include, for example, horseradish peroxidase. Enzyme activity bound to
the solid phase is measured by adding the specific substrate, and determining product formation or
substrate utilization colorimetrically. The enzyme activity bound is a direct function of the amount of
antibody bound.

To measure antigen, a known specific antibody is fixed to the solid phase, the test material containing
antigen is added, after an incubation the solid phase is washed, and a second enzyme-labeled antibody is
added. After washing, substrate is added, and enzyme activity is estimated colorimetrically, and related to
antigen concentration.
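
One simple way of relating a colorimetric reading to antigen concentration is interpolation on a curve of known standards, sketched below. The standards and the reading are hypothetical, the function name is an assumption, and linear interpolation between adjacent standards is only one of several curve-fitting choices.

```python
def interpolate_concentration(standards, absorbance):
    """Estimate concentration from absorbance by linear interpolation.

    standards is a list of (concentration, absorbance) pairs for known samples.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo <= absorbance <= a_hi:
            if a_hi == a_lo:
                return c_lo
            fraction = (absorbance - a_lo) / (a_hi - a_lo)
            return c_lo + fraction * (c_hi - c_lo)
    raise ValueError("absorbance outside the range of the standards")


# Hypothetical standards: (antigen in ng/ml, absorbance).
standards = [(0, 0.05), (10, 0.25), (50, 0.85), (100, 1.45)]
estimate = interpolate_concentration(standards, 0.55)
```

Readings that fall outside the range of the standards are rejected rather than extrapolated; such samples would ordinarily be diluted and assayed again.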


Examples 


Described below are examples of the present invention which are provided only for illustrative purposes,
and not to limit the scope of the present invention. In light of the present disclosure, numerous
embodiments within the scope of the claims will be apparent to those of ordinary skill in the art.

Isolation and Sequence of Overlapping HCV cDNA Clones 13i, 26j, CA59a, CA84a, CA156e and CA167b


The clones 13i, 26j, CA59a, CA84a, CA156e and CA167b were isolated from the lambda-gt11 library
which contains HCV cDNA (ATCC No. 40394), the preparation of which is described in EPO Pub. No.
318,216 (published 31 May 1989), and WO 89/04669 (published 1 June 1989). Screening of the library was
with the probes described infra, using the method described in Huynh (1985). The frequency with which
positive clones appeared with the respective probes was about 1 in 50,000.
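
To give a sense of the scale of screening implied by such a frequency, the following sketch estimates how many plaques would need to be examined to find at least one positive with a chosen probability. It assumes that positives occur independently at the stated frequency; the function name and the 95% target are illustrative assumptions.

```python
import math


def plaques_needed(frequency, probability):
    """Plaques to screen so that at least one positive is found with the given probability."""
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - frequency))


# Frequency of about 1 in 50,000; target probability of 95% (assumed).
n_95 = plaques_needed(1 / 50000, 0.95)
```

On these assumptions roughly three times the reciprocal of the frequency must be screened for a 95% chance of recovering a positive clone.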

The isolation of clone 13i was accomplished using a synthetic probe derived from the sequence of
clone 12f. The sequence of the probe was:

5' GAA CGT TGC GAT CTG GAA GAC AGG GAC AGG 3'.

The isolation of clone 26j was accomplished using a probe derived from the 5'-region of clone K9-1.
The sequence of the probe was:

5' TAT CAG TTA TGC CAA CGG AAG CGG CCC CGA 3'. 

The isolation procedures for clone 12f and for clone k9-1 (also called K9-1) are described in EPO Pub.
No. 318,216, and their sequences are shown in Figs. 1 and 2, respectively. The HCV cDNA sequences of
clones 13i and 26j are shown in Figs. 4 and 5, respectively. Also shown are the amino acids encoded
therein, as well as the overlap of clone 13i with clone 12f, and the overlap of clone 26j with clone 13i. The
sequences for these clones confirmed the sequence of clone K9-1. Clone K9-1 had been isolated from a
different HCV cDNA library (See EP 0,218,316).

Clone CA59a was isolated utilizing a probe based upon the sequence of the 5'-region of clone 26j. The
sequence of this probe was:

5' CTG GTT AGC AGG GCT TTT CTA TCA CCA CAA 3'.

A probe derived from the sequence of clone CA59a was used to isolate clone CA84a. The sequence of 
the probe used for this isolation was: 
5' AAG GTC CTG GTA GTG CTG CTG CTA TTT GCC 3'.

Clone CA156e was isolated using a probe derived from the sequence of clone CA84a. The sequence of
the probe was:

5' ACT GGA CGA CGC AAG GTT GCA ATT GCT CTA 3'.

Clone CA167b was isolated using a probe derived from the sequence of clone CA156e. The sequence
of the probe was:

5' TTC GAC GTC ACA TCG ATC TGC TTG TCG GGA 3'.

The nucleotide sequences of the HCV cDNAs in clones CA59a, CA84a, CA156e, and CA167b are
shown in Figs. 6, 7, 8, and 9, respectively. The amino acids encoded therein, as well as the overlap with the
sequences of relevant clones, are also shown in the Figs.
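
The way in which overlapping clones are combined into a longer composite sequence may be sketched as follows. The two fragments are hypothetical and are not HCV sequences; the function name and the minimum overlap of 10 bases are assumptions made for the example.

```python
def merge_overlapping(upstream, downstream, min_overlap=10):
    """Join two sequences that share a suffix/prefix overlap.

    Returns the merged sequence, or None if no overlap of at least
    min_overlap bases is found.
    """
    longest = min(len(upstream), len(downstream))
    for size in range(longest, min_overlap - 1, -1):
        if upstream[-size:] == downstream[:size]:
            return upstream + downstream[size:]
    return None


# Hypothetical fragments sharing a 12-base overlap.
clone_a = "ATGGCTTACGGATCCTTGACCAGTAGGCTA"
clone_b = "ACCAGTAGGCTACCGTTAGCAATG"
composite = merge_overlapping(clone_a, clone_b)
```

Exact matching is used here for clarity; sequences obtained from independent clones may differ at individual positions, which this sketch does not accommodate.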


Creation of "pi" HCV cDNA Library 


A library of HCV cDNA, the "pi" library, was constructed from the same batch of infectious chimpanzee
plasma used to construct the lambda-gt11 HCV cDNA library (ATCC No. 40394) described in EPO Pub. No.
318,216, and utilizing essentially the same techniques. However, construction of the pi library utilized a
primer-extension method, in which the primer for reverse transcriptase was based on the sequence of clone
CA59a. The sequence of the primer was:

5' GGT GAC GTG GGT TTC 3'. 


Isolation and Sequence of Clone pi14a


Screening of the "pi" HCV cDNA library described supra, with the probe used to isolate clone CA167b
(See supra) yielded clone pi14a. The clone contains about 800 base pairs of cDNA which overlaps clones
CA167b, CA156e, CA84a and CA59a, which were isolated from the lambda-gt11 HCV cDNA library (ATCC
No. 40394). In addition, pi14a also contains about 250 base pairs of DNA which are upstream of the HCV
cDNA in clone CA167b.


Isolation and Sequence of Clones CA216a, CA290a and ag30a 


Based on the sequence of clone CA167b a synthetic probe was made having the following sequence:

5' GGC TTT ACC ACG TCA CCA ATG ATT GCC CTA 3'

The above probe was used to screen the lambda-gt11 library (ATCC No. 40394), which yielded clone
CA216a, whose HCV sequences are shown in Fig. 10.

Another probe was made based on the sequence of clone CA216a having the following sequence:

5' TTT GGG TAA GGT CAT CGA TAC CCT TAC GTG 3'

Screening the lambda-gt11 library (ATCC No. 40394) with this probe yielded clone CA290a, the HCV
sequences therein being shown in Fig. 11.

In a parallel approach, a primer-extension cDNA library was made using nucleic acid extracted from the
same infectious plasma used in the original lambda-gt11 cDNA library described above. The primer used
was based on the sequence of clones CA216a and CA290a:

5' GAA GCC GCA CGT AAG 3'

The cDNA library was made using methods similar to those described previously for libraries used in the
isolation of clones pi14a and k9-1. The probe used to screen this library was based on the sequence of
clone CA290a:

5' CCG GCG TAG GTC GCG CAA TTT GGG TAA 3' 

Clone ag30a was isolated from the new library with the above probe, and contained about 670 base pairs of
HCV sequence. See Fig. 12. Part of this sequence overlaps the HCV sequence of clones CA216a and
CA290a. About 300 base pairs of the ag30a sequence, however, is upstream of the sequence from clone
CA290a. The non-overlapping sequence shows a start codon (*) and stop codons that may indicate the start
of the HCV ORF. Also indicated in Fig. 12 are putative small encoded peptides (#) which may play a role in
regulating translation, as well as the putative first amino acid of the putative polypeptide (/), and downstream
amino acids encoded therein.


Isolation and Sequence of Clone CA205a


Clone CA205a was isolated from the original lambda-gt11 library (ATCC No. 40394), using a synthetic
probe derived from the HCV sequence in clone CA290a (Fig. 11). The sequence of the probe was:

5' TCA GAT CGT TGG TGG AGT TTA CTT GTT GCC 3'.

The sequence of the HCV cDNA in CA205a, shown in Fig. 13, overlaps with the cDNA sequences in both
clones ag30a and CA290a. The overlap of the sequence with that of CA290a is shown by the dotted line
above the sequence (the figure also shows the putative amino acids encoded in this fragment).

As observed from the HCV cDNA sequences in clones CA205a and ag30a, the putative HCV
polyprotein appears to begin at the ATG start codon; the HCV sequences in both clones contain an
in-frame, contiguous double stop codon (TGATAG) forty two nucleotides upstream from this ATG. The HCV
ORF appears to begin after these stop codons, and to extend for at least 8907 nucleotides (See the
composite HCV cDNA shown in Fig. 17).
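
The reasoning by which an upstream in-frame stop codon delimits the start of an ORF may be sketched as follows. The sequence is a short hypothetical construct, not the HCV sequence, and its spacing does not reproduce the distances stated above; the function names are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_in_frame_stops(sequence, atg_index):
    """List (index, codon) for stop codons upstream of, and in frame with, the ATG."""
    hits = []
    for i in range(atg_index % 3, atg_index, 3):
        codon = sequence[i:i + 3]
        if codon in STOP_CODONS:
            hits.append((i, codon))
    return hits


def orf_length(sequence, atg_index):
    """Nucleotides from the ATG to the first in-frame stop (or the end of the sequence)."""
    for i in range(atg_index, len(sequence) - 2, 3):
        if sequence[i:i + 3] in STOP_CODONS:
            return i - atg_index
    return (len(sequence) - atg_index) // 3 * 3


# Hypothetical sequence: a double stop (TGATAG), a spacer, then a short ORF.
seq = "CC" + "TGATAG" + "GCC" * 4 + "ATG" + "GCT" * 5 + "TAA"
atg = seq.find("ATG")
stops = upstream_in_frame_stops(seq, atg)
length = orf_length(seq, atg)
```

Because the double stop lies in the same reading frame as the ATG, translation in that frame cannot have initiated further upstream, which is the basis for placing the start of the ORF after the stop codons.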


Isolation and Sequence of Clone 18g


Based on the sequence of clone ag30a (See Fig. 12) and of an overlapping clone from the original
lambda-gt11 library (ATCC No. 40394), CA230a, a synthetic probe was made having the following
sequence:

5' CCA TAG TGG TCT GCG GAA CCG GTG AGT ACA 3'.

Screening of the original lambda-gt11 HCV cDNA library with the probe yielded clone 18g, the HCV cDNA
sequence of which is shown in Fig. 14. Also shown in the figure are the overlap with clone ag30a, and
putative polypeptides encoded within the HCV cDNA.

The cDNA in clone 18g (C18g or 18g) overlaps that in clones ag30a and CA205a, described supra. The
sequence of C18g also contains the double stop codon region observed in clone ag30a. The polynucleotide
region upstream of these stop codons presumably represents part of the 5'-region of the HCV genome,
which may contain short ORFs, and which can be confirmed by direct sequencing of the purified HCV
genome. These putative small encoded peptides may play a regulatory role in translation. The region of the
HCV genome upstream of that represented by C18g can be isolated for sequence analysis using essentially
the technique described in EPO Pub. No. 318,216 for isolating cDNA sequences upstream of the HCV
cDNA sequence in clone 12f. Essentially, small synthetic oligonucleotide primers of reverse transcriptase,
which are based upon the sequence of C18g, are synthesized and used to bind to the corresponding
sequence in HCV genomic RNA. The primer sequences are proximal to the known 5'-terminal of C18g, but
sufficiently downstream to allow the design of probe sequences upstream of the primer sequences. Known
standard methods of priming and cloning are used. The resulting cDNA libraries are screened with
sequences upstream of the priming sites (as deduced from the elucidated sequence of C18g). The HCV
genomic RNA is obtained from either plasma or liver samples from individuals with NANBH. Since HCV
appears to be a Flavi-like virus, the 5'-terminus of the genome may be modified with a "cap" structure. It is
known that Flavivirus genomes contain 5'-terminal "cap" structures. (Yellow Fever virus, Rice et al. (1988);
Dengue virus, Hahn et al. (1988); Japanese Encephalitis Virus (1987)).

Isolation and Sequence of Clones from the beta-HCV cDNA library


Clones containing cDNA representative of the 3'-terminal region of the HCV genome were isolated from
a cDNA library constructed from the original infectious chimpanzee plasma pool which was used for the
creation of the HCV cDNA lambda-gt11 library (ATCC No. 40394), described in EPO Pub. No. 318,216. In
order to create the DNA library, RNA extracted from the plasma was "tailed" with poly rA using poly (rA)
polymerase, and cDNA was synthesized using oligo(dT)12-18 as a primer for reverse transcriptase. The
resulting RNA:cDNA hybrid was digested with RNAase H, and converted to double stranded HCV cDNA.
The resulting HCV cDNA was cloned into lambda-gt10, using essentially the technique described in Huynh
(1985), yielding the beta (or b) HCV cDNA library. The procedures used were as follows.

An aliquot (12 ml) of the plasma was treated with proteinase K, and extracted with an equal volume of
phenol saturated with 0.05M Tris-Cl, pH 7.5, 0.05% (v/v) beta-mercaptoethanol, 0.1% (w/v)
hydroxyquinoline, 1 mM EDTA. The resulting aqueous phase was re-extracted with the phenol mixture, followed
by 3 extractions with a 1:1 mixture containing phenol and chloroform:isoamyl alcohol (24:1), followed by 2
extractions with a mixture of chloroform and isoamyl alcohol (1:1). Subsequent to adjustment of the aqueous
phase to 200 mM with respect to NaCl, nucleic acids in the aqueous phase were precipitated overnight at
-20°C, with 2.5 volumes of cold absolute ethanol. The precipitates were collected by centrifugation at
10,000 RPM for 40 min., washed with 70% ethanol containing 20 mM NaCl, and with 100% cold ethanol,
dried for 5 min. in a desiccator, and dissolved in water.

The isolated nucleic acids from the infectious chimpanzee plasma pool were tailed with poly rA utilizing
poly-A polymerase in the presence of human placenta ribonuclease inhibitor (HPRI) (purchased from
Amersham Corp.), utilizing MS2 RNA as carrier. Isolated nucleic acids equivalent to that in 2 ml of plasma
were incubated in a solution containing TMN (50 mM Tris HCl, pH 7.9, 10 mM MgCl2, 250 mM NaCl, 2.5
mM MnCl2, 2 mM dithiothreitol (DTT)), 40 micromolar alpha-[32P] ATP, 20 units HPRI (Amersham Corp.),
and about 9 to 10 units of RNase free poly-A polymerase (BRL). Incubation was for 10 min. at 37°C, and
the reactions were stopped with EDTA (final concentration about 250 mM). The solution was extracted with
an equal volume of phenol-chloroform, and with an equal volume of chloroform, and nucleic acids were
precipitated overnight at -20°C with 2.5 volumes of ethanol in the presence of 200 mM NaCl.


Isolation of Clone b5a 



The beta HCV cDNA library was screened by hybridization using a synthetic probe, which had a
sequence based upon the HCV cDNA sequence in clone 15e. The isolation of clone 15e is described in
EPO Pub. No. 318,216, and its sequence is shown in Fig. 3. The sequence of the synthetic probe was:

5' ATT GCG AGA TCT ACG GGG CCT GCT ACT CCA 3'.

Screening of the library yielded clone beta-5a (b5a), which contains an HCV cDNA region of approximately
1000 base pairs. The 5'-region of this cDNA overlaps clones 35f, 19g, 26g, and 15e (these clones are
described supra). The region between the 3'-terminal poly-A sequence and the 3'-sequence which overlaps
clone 15e contains approximately 200 base pairs. This clone allows the identification of a region of the
3'-terminal sequence of the HCV genome.

The sequence of b5a is contained within the sequence of the HCV cDNA in clone 16jh (described infra).
Moreover, the sequence is also present in CC34a, isolated from the original lambda-gt11 library (ATCC No.
40394). (The original lambda-gt11 library is referred to herein as the "C" library).

Isolation and Sequence of Clones Generated by PCR Amplification of the 3'-Region of the HCV Genome


Multiple cDNA clones have been generated which contain nucleotide sequences derived from the
3'-region of the HCV genome. This was accomplished by amplifying a targeted region of the genome by a
polymerase chain reaction technique described in Saiki et al. (1986), and in Saiki et al. (1988), which was
modified as described below. The HCV RNA which was amplified was obtained from the original infectious
chimpanzee plasma pool which was used for the creation of the HCV cDNA lambda-gt11 library (ATCC No.
40394) described in EPO Pub. No. 318,216. Isolation of the HCV RNA was as described supra. The isolated
RNA was tailed at the 3'-end with ATP by E. coli poly-A polymerase as described in Sippel (1973), except
that the nucleic acids isolated from chimp serum were substituted for the nucleic acid substrate. The tailed
RNA was then reverse transcribed into cDNA by reverse transcriptase, using an oligo dT-primer adapter,
essentially as described by Han (1987), except that the components and sequence of the primer-adapter
were:


Stuffer 

Notl 

SP6 Promoter 

Primer 

_AATTC 

GCGGCCGC 

CATACG ATTTAGGT G ACACT AT AGAA 

T 1 5 


The resultant cDNA was subjected to amplification by PCR using two primers:

Primer          Sequence
JH32 (30mer)    ATAGCGGCCGCCCTCGATTGCGAGATCTAC
JH11 (20mer)    AATTCGGGCGGCCGCCATACGA

The JH32 primer contained a 20-nucleotide sequence hybridizable to the 5'-end of the target region in the
cDNA, with an estimated Tm of 66°C. The JH11 was derived from a portion of the oligo dT-primer adapter;
thus, it is specific to the 3'-end of the cDNA with a Tm of 64°C. Both primers were designed to have a
recognition site for the restriction enzyme, NotI, at the 5'-end, for use in subsequent cloning of the amplified
HCV cDNA.
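The primer design described above can be checked with a short script. The following is only an illustrative sketch, not part of the described procedure: it searches each primer, copied as printed in the text, for the NotI recognition sequence GCGGCCGC, and confirms that the 3'-portion of JH32 matches the start of the clone 15e-based screening probe given earlier. The helper name `site_offset` is hypothetical.

```python
# NotI recognition sequence (palindromic 8-mer).
NOTI_SITE = "GCGGCCGC"

# Primer sequences copied as printed in the text; no correction is attempted.
PRIMERS = {
    "JH32": "ATAGCGGCCGCCCTCGATTGCGAGATCTAC",
    "JH11": "AATTCGGGCGGCCGCCATACGA",
}

# Synthetic probe based on clone 15e, as printed (spaces removed).
CLONE_15E_PROBE = "ATTGCGAGATCTACGGGGCCTGCTACTCCA"


def site_offset(seq, site=NOTI_SITE):
    """Return the 0-based offset of the first occurrence of site, or -1."""
    return seq.upper().find(site)


# Offset of the NotI site in each primer, counted from the 5'-end.
offsets = {name: site_offset(seq) for name, seq in PRIMERS.items()}

# The last 14 nucleotides of JH32 coincide with the 5'-end of the probe.
jh32_matches_probe = CLONE_15E_PROBE.startswith(PRIMERS["JH32"][-14:])

print(offsets, jh32_matches_probe)
```

In both primers the NotI site lies near the 5'-end, consistent with its stated use in cloning the amplified cDNA.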

The PCR reaction was carried out by suspending the cDNA and the primers in 100 microliters of
reaction mixture containing the four deoxynucleoside triphosphates, buffer salts and metal ions, and a
thermostable DNA polymerase isolated from Thermus aquaticus (Taq polymerase), which are in a Perkin
Elmer Cetus PCR kit (N801-0043 or N801-0055). The PCR reaction was performed for 35 cycles in a Perkin
Elmer Cetus DNA thermal cycler. Each cycle consisted of a 1.5 min denaturation step at 94°C, an
annealing step at 60°C for 2 min, and a primer extension step at 72°C for 3 min. The PCR products were
subjected to Southern blot analysis using a 30 nucleotide probe, JH34, the sequence of which was based
upon that of the 3'-terminal region of clone 15e. The sequence of JH34 is:

5' CTT GAT CTA CCT CCA ATC ATT CAA AGA CTC 3'.

The PCR products detected by the HCV cDNA probe ranged in size from about 50 to about 400 base pairs.
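The cycling conditions given above can be written down as a small data structure, which makes the total programmed time easy to compute. This is a minimal sketch only; the class and function names are hypothetical, and ramp times between temperatures (which depend on the instrument) are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float   # set temperature in degrees Celsius
    minutes: float  # hold time at that temperature


# One cycle as described in the text.
CYCLE = (
    Step("denaturation", 94, 1.5),
    Step("annealing", 60, 2.0),
    Step("extension", 72, 3.0),
)
N_CYCLES = 35


def program_minutes(cycle, n_cycles):
    """Total hold time in minutes, excluding temperature ramps."""
    return n_cycles * sum(step.minutes for step in cycle)


total_minutes = program_minutes(CYCLE, N_CYCLES)
print(total_minutes)  # 227.5
```

Under these assumptions the 35 cycles account for a little under four hours of hold time.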

In order to clone the amplified HCV cDNA, the PCR products were cleaved with NotI and size selected
by polyacrylamide gel electrophoresis. DNA larger than 300 base pairs was cloned into the NotI site of
pUC18S. The vector pUC18S is constructed by including a NotI polylinker cloned between the EcoRI and
SalI sites of pUC18. The clones were screened for HCV cDNA using the JH34 probe. A number of positive
clones were obtained and sequenced. The nucleotide sequence of the HCV cDNA insert in one of these
clones, 16jh, and the amino acids encoded therein, are shown in Fig. 15. A nucleotide heterogeneity,
detected in the sequence of the HCV cDNA in clone 16jh as compared to another clone of this region, is
indicated in the figure.

Compiled HCV cDNA Sequences 


An HCV cDNA sequence has been compiled from a series of overlapping clones derived from the
various HCV cDNA libraries described supra. In this sequence, the compiled HCV cDNA sequence
obtained from clones b114a, 18g, ag30a, CA205a, CA290a, CA216a, pi14a, CA167b, CA156e, CA84a, and
CA59a is upstream of the compiled HCV cDNA sequence published in EPO Pub. No. 318,216, which is
shown in Fig. 16. The compiled HCV cDNA sequence obtained from clones b5a and 16jh is downstream of
the compiled HCV cDNA sequence published in EPO Pub. No. 318,216.

Fig. 17 shows the compiled HCV cDNA sequence derived from the above-described clones and the
compiled HCV cDNA sequence published in EPO Pub. No. 318,216. The clones from which the sequence
was derived are b114a, 18g, ag30a, CA205a, CA290a, CA216a, pi14a, CA167b, CA156e, CA84a, CA59a,
K9-1 (also called k9-1), 26j, 13i, 12f, 14i, 11b, 7f, 7e, 8h, 33c, 40b, 37b, 35, 36, 81, 32, 33b, 25c, 14c, 8f, 33f,
33g, 39c, 35f, 19g, 26g, 15e, b5a, and 16jh. In the figure the three dashes above the sequence indicate the
position of the putative initiator methionine codon.

Clone b114a was obtained using the cloning procedure described for clone b5a, supra, except that the
probe was the synthetic probe used to detect clone 18g, supra. Clone b114a overlaps with clones 18g,
ag30a, and CA205a, except that clone b114a contains an extra two nucleotides upstream of the sequence in
clone 18g (i.e., 5'-CA). These extra two nucleotides have been included in the HCV genomic sequence
shown in Fig. 17.

It should be noted that although several of the clones described supra have been obtained from
libraries other than the original HCV cDNA lambda-gt11 C library (ATCC No. 40394), these clones contain
HCV cDNA sequences which overlap HCV cDNA sequences in the original library. Thus, essentially all of
the HCV sequence is derivable from the original lambda-gt11 C library (ATCC No. 40394) which was used
to isolate the first HCV cDNA clone (5-1-1). The isolation of clone 5-1-1 is described in EPO Pub. No.
318,216.


Purification of Fusion Polypeptide C100-3 (Alternate Method)


The fusion polypeptide, C100-3 (also called HCV c100-3 and alternatively, c100-3), is comprised of
superoxide dismutase (SOD) at the N-terminus and an in-frame C100 HCV polypeptide at the C-terminus. A
method for preparing the polypeptide by expression in yeast, and differential extraction of the insoluble
fraction of the extracted host yeast cells, is described in EPO Pub. No. 318,216. An alternate method for
the preparation of this fusion polypeptide is described below. In this method the antigen is precipitated from
the crude cell lysate with acetone; the acetone precipitated antigen is then subjected to ion-exchange
chromatography, and further purified by gel filtration.

The fusion polypeptide, C100-3 (HCV c100-3), is expressed in yeast strain JSC 308 (ATCC No. 20879)
transformed with pAB24C100-3 (ATCC No. 67976); the transformed yeast are grown under conditions which
allow expression (i.e., by growth in YEP containing 1% glucose). (See EPO Pub. No. 318,216). A cell lysate
is prepared by suspending the cells in Buffer A (20 mM Tris HCl, pH 8.0, 1 mM EDTA, 1 mM PMSF). The
cells are broken by grinding with glass beads in a Dynomill type homogenizer or its equivalent. The extent
of cell breakage is monitored by counting cells under a microscope with phase optics. Broken cells appear
dark, while viable cells are light-colored. The percentage of broken cells is determined.

When the percentage of broken cells is approximately 90% or greater, the broken cell debris is
separated from the glass beads by centrifugation, and the glass beads are washed with Buffer A. After
combining the washes and homogenate, the insoluble material in the lysate is obtained by centrifugation.
The material in the pellet is washed to remove soluble proteins by suspension in Buffer B (50 mM glycine,
pH 12.0, 1 mM DTT, 500 mM NaCl), followed by Buffer C (50 mM glycine, pH 10.0, 1 mM DTT). The
insoluble material is recovered by centrifugation, and solubilized by suspension in Buffer C containing SDS.
The extract solution may be heated in the presence of beta-mercaptoethanol and concentrated by
ultrafiltration. The HCV c100-3 in the extract is precipitated with cold acetone. If desired, the precipitate may
be stored at temperatures at about or below -15°C.

Prior to ion exchange chromatography, the acetone precipitated material is recovered by centrifugation,
and may be dried under nitrogen. The precipitate is suspended in Buffer D (50 mM glycine, pH 10.0, 1 mM
DTT, 7 M urea), and centrifuged to pellet insoluble material. The supernatant material is applied to an anion
exchange column previously equilibrated with Buffer D. Fractions are collected and analyzed by ultraviolet
absorbance or gel electrophoresis on SDS polyacrylamide gels. Those fractions containing the HCV c100-3
polypeptide are pooled.

In order to purify the HCV c100-3 polypeptide by gel filtration, the pooled fractions from the
ion-exchange column are heated in the presence of beta-mercaptoethanol and SDS, and the eluate is
concentrated by ultrafiltration. The concentrate is applied to a gel filtration column previously equilibrated
with Buffer E (20 mM Tris HCl, pH 7.0, 1 mM DTT, 0.1% SDS). The presence of HCV c100-3 in the eluted
fractions, as well as the presence of impurities, is determined by gel electrophoresis on polyacrylamide
gels in the presence of SDS and visualization of the polypeptides. Those fractions containing purified HCV
c100-3 are pooled. Fractions high in HCV c100-3 may be further purified by repeating the gel filtration
process. If the removal of particulate material is desired, the HCV c100-3 containing material may be filtered
through a 0.22 micron filter.


Expression and Antigenicity of Polypeptides Encoded in HCV cDNA 

Polypeptides Expressed in E. coli


The polypeptides encoded in a number of HCV cDNAs which span the HCV genomic ORF were
expressed in E. coli, and tested for their antigenicity using serum obtained from a variety of individuals with
NANBH. The expression vectors containing the cloned HCV cDNAs were constructed from pSODcf1
(Steimer et al. (1986)). In order to be certain that a correct reading frame would be achieved, three separate
expression vectors, pcf1AB, pcf1CD, and pcf1EF, were created by ligating one of three linkers, AB, CD,
and EF, to a BamHI-EcoRI fragment derived by digesting to completion the vector pSODcf1 with EcoRI and
BamHI, followed by treatment with alkaline phosphatase. The linkers were created from six oligomers, A, B,
C, D, E, and F. Each oligomer was phosphorylated by treatment with kinase in the presence of ATP prior to
annealing to its complementary oligomer. The sequences of the synthetic linkers were the following.

Name    DNA Sequence (5' to 3')

A       GATC CTG AAT TCC TGA TAA
B            GAC TTA AGG ACT ATT TTA A

C       GATC CGA ATT CTG TGA TAA
D            GCT TAA GAC ACT ATT TTA A

E       GATC CTG GAA TTC TGA TAA
F            GAC CTT AAG ACT ATT TTA A

Each of the three linkers destroys the original EcoRI site, and creates a new EcoRI site within the linker, but
within a different reading frame. Hence, the HCV cDNA EcoRI fragments isolated from the clones, when
inserted into the three expression vectors, were in three different reading frames.
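The reading-frame shift introduced by the three linkers can be illustrated with a few lines of code. This is a sketch under stated assumptions, not the method used in the study: it takes the top-strand oligomers A, C, and E as printed in the text, locates the new EcoRI site (GAATTC) in each, and reports its offset from the first nucleotide of the linker modulo 3. The function name `ecori_frame` is hypothetical.

```python
ECORI_SITE = "GAATTC"

# Top-strand oligomers A, C and E of linkers AB, CD and EF (spaces removed).
TOP_STRANDS = {
    "AB": "GATCCTGAATTCCTGATAA",
    "CD": "GATCCGAATTCTGTGATAA",
    "EF": "GATCCTGGAATTCTGATAA",
}


def ecori_frame(seq):
    """Offset of the EcoRI site from the linker's first base, modulo 3."""
    offset = seq.find(ECORI_SITE)
    if offset < 0:
        raise ValueError("no EcoRI site found in linker")
    return offset % 3


frames = {name: ecori_frame(seq) for name, seq in TOP_STRANDS.items()}
print(frames)  # {'AB': 0, 'CD': 2, 'EF': 1}
```

Because the three offsets differ modulo 3, a given EcoRI fragment inserted into the three vectors is read in each of the three possible frames, one of which is expected to match the HCV ORF.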

The HCV cDNA fragments in the designated lambda-gt11 clones were excised by digestion with EcoRI;
each fragment was inserted into pcf1AB, pcf1CD, and pcf1EF. These expression constructs were then
transformed into D1210 E. coli cells, the transformants were cloned, and recombinant bacteria from each
clone were induced to express the fusion polypeptides by growing the bacteria in the presence of IPTG.

Expression products of the indicated HCV cDNAs were tested for antigenicity by direct immunological
screening of the colonies, using a modification of the method described in Helfman et al. (1983). Briefly, as
shown in Fig. 18, the bacteria were plated onto nitrocellulose filters overlaid on ampicillin plates to give
approximately 1,000 colonies per filter. Colonies were replica plated onto nitrocellulose filters, and the
replicas were regrown overnight in the presence of 2 mM IPTG and ampicillin. The bacterial colonies were
lysed by suspending the nitrocellulose filters for about 15 to 20 min in an atmosphere saturated with CHCl3
vapor. Each filter then was placed in an individual 100 mm Petri dish containing 10 ml of 50 mM Tris HCl,
pH 7.5, 150 mM NaCl, 5 mM MgCl2, 3% (w/v) BSA, 40 micrograms/ml lysozyme, and 0.1 microgram/ml
DNase. The plates were agitated gently for at least 8 hours at room temperature. The filters were rinsed in
TBST (50 mM Tris HCl, pH 8.0, 150 mM NaCl, 0.005% Tween 20). After incubation, the cell residues were
rinsed and incubated in TBS (TBST without Tween) containing 10% sheep serum; incubation was for 1
hour. The filters were then incubated with pretreated sera in TBS from individuals with NANBH, which
included: 3 chimpanzees; 8 patients with chronic NANBH whose sera were positive with respect to
antibodies to HCV C100-3 polypeptide (described in EPO Pub. No. 318,216, and supra) (also called C100);
8 patients with chronic NANBH whose sera were negative for anti-C100 antibodies; a convalescent patient
whose serum was negative for anti-C100 antibodies; and 6 patients with community acquired NANBH,
including one whose serum was strongly positive with respect to anti-C100 antibodies, and one whose serum
was marginally positive with respect to anti-C100 antibodies. The sera, diluted in TBS, were pretreated by
preabsorption with hSOD. Incubation of the filters with the sera was for at least two hours. After incubation,
the filters were washed two times for 30 min with TBST. Labeling of expressed proteins to which antibodies
in the sera bound was accomplished by incubation for 2 hours with 125I-labeled sheep anti-human antibody.
After labeling, the filters were washed twice for 30 min with TBST, dried, and autoradiographed.

A number of clones (see infra) expressed polypeptides containing HCV epitopes which were
immunologically reactive with serum from individuals with NANBH. Five of these polypeptides were very
immunogenic in that antibodies to HCV epitopes in these polypeptides were detected in many different
patient sera. The clones encoding these polypeptides, and the location of the polypeptide in the putative
HCV polyprotein (wherein the amino acid numbers begin with the putative initiator codon) are the following:
clone 5-1-1, amino acids 1694-1735; clone C100, amino acids 1569-1931; clone 33c, amino acids
1192-1457; clone CA279a, amino acids 1-84; and clone CA290a, amino acids 9-177. The locations of the
immunogenic polypeptides within the putative HCV polyprotein are shown immediately below.

Clones encoding polypeptides of proven reactivity with sera from NANBH patients.

Clone      Location within the HCV polyprotein
           (amino acid no. beginning with
           putative initiator methionine)

CA279a     1-84
CA74a      437-582
13i        511-690
CA290a     9-177
33c        1192-1457
40b        1266-1428
5-1-1      1694-1735
81         1689-1805
33b        1916-2021
25c        1949-2124
14c        2054-2223
8f         2200-2325
33f        2287-2385
33g        2348-2464
39c        2371-2502
15e        2796-2886
C100       1569-1931


The results on the immunogenicity of the polypeptides encoded in the various clones examined
suggest that efficient detection and immunization systems may include panels of HCV polypeptides/epitopes.
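To see how much of the putative polyprotein is spanned by the reactive clones listed above, the amino acid ranges can be merged into non-overlapping regions. The following is an illustrative sketch only; the function name `merge_ranges` is hypothetical, and the range for clone 8f is taken as 2200-2325, the range recited for the corresponding peptide in claim 10.

```python
# Amino acid ranges (inclusive) of the reactive clones, from the table above.
CLONE_RANGES = {
    "CA279a": (1, 84), "CA74a": (437, 582), "13i": (511, 690),
    "CA290a": (9, 177), "33c": (1192, 1457), "40b": (1266, 1428),
    "5-1-1": (1694, 1735), "81": (1689, 1805), "33b": (1916, 2021),
    "25c": (1949, 2124), "14c": (2054, 2223),
    "8f": (2200, 2325),  # assumption: range as recited in claim 10
    "33f": (2287, 2385), "33g": (2348, 2464), "39c": (2371, 2502),
    "15e": (2796, 2886), "C100": (1569, 1931),
}


def merge_ranges(ranges):
    """Merge overlapping inclusive (start, end) ranges into sorted regions."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous region: extend it if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


regions = merge_ranges(CLONE_RANGES.values())
covered = sum(end - start + 1 for start, end in regions)
print(regions, covered)
```

The merged regions make the gaps between reactive segments explicit, which may help when choosing a panel of polypeptides that spans distinct parts of the polyprotein.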


Expression of HCV Epitopes in Yeast 


Three different yeast expression vectors which allow the insertion of HCV cDNA into three different
reading frames are constructed. The construction of one of the vectors, pAB24C100-3, is described in EPO Pub.
No. 318,216. In the studies below, the HCV cDNAs from the clones listed supra, in the antigenicity
mapping study using the E. coli expressed products, are substituted for the C100 HCV cDNA. The
construction of the other vectors replaces the adaptor described in the above E. coli studies with one of the
following adaptors:


Adaptor 1

ATT TTG AAT TCC TAA TGA G
AC TTA AGG ATT ACT CAG CT


Adaptor 2

AAT TTG GAA TTC TAA TGA G
AC CTT AAG ATT ACT CAG CT


The inserted HCV cDNA is expressed in yeast transformed with the vectors, using the expression
conditions described supra for the expression of the fusion polypeptide, C100-3. The resulting polypeptides
are screened using the sera from individuals with NANBH, described supra for the screening of
immunogenic polypeptides encoded in HCV cDNAs expressed in E. coli.


Comparison of the Hydrophobic Profiles of HCV Polyproteins with West Nile Virus Polyprotein and with Dengue Virus NS1

The hydrophobicity profile of an HCV polyprotein segment was compared with that of a typical
Flavivirus, West Nile virus. The polypeptide sequence of the West Nile virus polyprotein was deduced from
the known polynucleotide sequences encoding the non-structural proteins of that virus. The HCV
polyprotein sequence was deduced from the sequence of overlapping cDNA clones. The profiles were
determined using an antigen program which uses a window of 7 amino acid width (the amino acid in
question, and 3 residues on each side) to report the average hydrophobicity about a given amino acid
residue. The parameters giving the relative hydrophobicity for each amino acid residue are from Kyte and
Doolittle (1982). Fig. 19 shows the hydrophobic profiles of the two polyproteins; the areas corresponding to
the non-structural proteins of West Nile virus, ns1 through ns5, are indicated in the figure. As seen in the
figure, there is a general similarity in the profiles of the HCV polyprotein and the West Nile virus
polyprotein.
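The windowed averaging described above can be sketched as follows. This is not the program used to generate Fig. 19; it is a minimal illustration that assumes the standard Kyte and Doolittle (1982) hydropathy values and a centered window of 7 residues, and the function name `hydropathy_profile` is hypothetical. In this scale, positive values denote hydrophobic residues.

```python
# Kyte and Doolittle (1982) hydropathy index, one-letter amino acid codes.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return (position, mean hydropathy) for each residue with a full window.

    Positions are 1-based. With window=7 the mean covers the residue in
    question and 3 residues on each side, so the first and last 3 residues
    of the sequence have no value.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    seq = sequence.upper()
    half = window // 2
    profile = []
    for center in range(half, len(seq) - half):
        residues = seq[center - half:center + half + 1]
        mean = sum(KYTE_DOOLITTLE[r] for r in residues) / window
        profile.append((center + 1, mean))
    return profile


# Toy sequence (not an HCV sequence): a hydrophobic run followed by a charged run.
for position, value in hydropathy_profile("IIIIIIIKKKKKKK"):
    print(position, round(value, 2))
```

Two profiles computed this way for different polyproteins can then be plotted against residue number and compared by eye, which is the kind of comparison the text describes.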

The sequence of the amino acids encoded in the 5'-region of HCV cDNA shown in Fig. 16 has been
compared with the corresponding region of one of the strains of Dengue virus, described supra, with
respect to the profile of regions of hydrophobicity and hydrophilicity (data not shown). This comparison
indicated that the polypeptides from HCV and Dengue encoded in this region, which corresponds to the
region encoding NS1 (or a portion thereof), have a similar hydrophobic/hydrophilic profile.

The similarity in hydrophobicity profiles, in combination with the previously identified homologies in the 
amino acid sequences of HCV and Dengue Flavivirus in EP 0,218,316 suggests that HCV is related to these 
members of the Flavivirus family. 

Characterization of the Putative Polypeptides Encoded Within the HCV ORF

The sequence of the HCV cDNA sense strand, shown in Fig. 17, was deduced from the overlapping
HCV cDNAs in the various clones described in EPO Pub. No. 318,216 and those described supra. It may be
deduced from the sequence that the HCV genome contains primarily one long continuous ORF, which
encodes a polyprotein. In the sequence, nucleotide number 1 corresponds to the first nucleotide of the
initiator MET codon; minus numbers indicate that the nucleotides are that distance away in the 5'-direction
(upstream), while positive numbers indicate that the nucleotides are that distance away in the 3'-direction
(downstream). The composite sequence shows the "sense" strand of the HCV cDNA.

The amino acid sequence of the putative HCV polyprotein deduced from the HCV cDNA sense strand
sequence is also shown in Fig. 17, where position 1 begins with the putative initiator methionine.

Possible protein domains of the encoded HCV polyprotein, as well as the approximate boundaries, are
the following (the polypeptides identified within the parentheses are those which are encoded in the
Flavivirus domain):

Putative Domain                                                     Approximate Boundary
                                                                    (amino acid nos.)

"C" (nucleocapsid protein)                                          1-120
"E" (virion envelope protein(s) and possibly matrix (M) proteins)   120-400
"NS1" (complement fixation antigen?)                                400-660
"NS2" (unknown function)                                            660-1050
"NS3" (protease?)                                                   1050-1640
"NS4" (unknown function)                                            1640-2000
"NS5" (polymerase)                                                  2000-? end

It should be noted, however, that hydrophobicity profiles (described infra) indicate that HCV diverges
from the Flavivirus model, particularly with respect to the region upstream of NS2. Moreover, the
boundaries indicated are not intended to show firm demarcations between the putative polypeptides.


The Hydrophilic and Antigenic Profile of the Polypeptide


Profiles of the hydrophilicity/hydrophobicity and the antigenic index of the putative polyprotein encoded
in the HCV cDNA sequence shown in Fig. 16 were determined by computer analysis. The program for
hydrophilicity/hydrophobicity was as described supra. The antigenic index results from a computer program
which relies on the following criteria: 1) surface probability; 2) prediction of alpha-helicity by two different
methods; 3) prediction of beta-sheet regions by two different methods; 4) prediction of U-turns by two
different methods; 5) hydrophilicity/hydrophobicity; and flexibility. The traces of the profiles generated by
the computer analyses are shown in Fig. 20. In the hydrophilicity profile, deflection above the abscissa
indicates hydrophilicity, and below the abscissa indicates hydrophobicity. The probability that a polypeptide
region is antigenic is usually considered to increase when there is a deflection upward from the abscissa in
the hydrophilic and/or antigenic profile. It should be noted, however, that these profiles are not necessarily
indicators of the strength of the immunogenicity of a polypeptide.


Identification of Co-linear Peptides in HCV and Flaviviruses

The amino acid sequence of the putative polyprotein encoded in the HCV cDNA sense strand was
compared with the known amino acid sequences of several members of Flaviviruses. The comparison
shows that homology is slight, but due to the regions in which it is found, it is probably significant. The
conserved co-linear regions are shown in Fig. 21. The amino acid numbers listed below the sequences
represent the number in the putative HCV polyprotein (see Fig. 17).

The spacing of these conserved motifs is similar between the Flaviviruses and HCV, and implies that
there is some similarity between HCV and these flaviviral agents.

The following listed materials are on deposit under the terms of the Budapest Treaty with the American
Type Culture Collection (ATCC), 12301 Parklawn Dr., Rockville, Maryland 20852, and have been assigned
the following Accession Numbers.

lambda-gt11          ATCC No.    Deposit Date

HCV cDNA library     40394       1 Dec. 1987
clone 81             40388       17 Nov. 1987
clone 91             40389       17 Nov. 1987
clone 1-2            40390       17 Nov. 1987
clone 5-1-1          40391       18 Nov. 1987
clone 12f            40514       10 Nov. 1988
clone 35f            40511       10 Nov. 1988
clone 15e            40513       10 Nov. 1988
clone K9-1           40512       10 Nov. 1988
JSC 308              20879       5 May 1988
pS356                67683       29 April 1988


In addition, the following deposits were made on 11 May 1989. 


Strain                    Linkers    ATCC No.

D1210 (Cf1/5-1-1)         EF         67967
D1210 (Cf1/81)            EF         67968
D1210 (Cf1/CA74a)         EF         67969
D1210 (Cf1/35f)           AB         67970
D1210 (Cf1/279a)          EF         67971
D1210 (Cf1/C36)           CD         67972
D1210 (Cf1/13i)           AB         67973
D1210 (Cf1/C33b)          EF         67974
D1210 (Cf1/CA290a)        AB         67975
HB101 (AB24/C100 #3R)                67976


The following derivatives of strain D1210 were deposited on 3 May 1989.


Strain Derivative    ATCC No.

pCF1CS/C8f           67956
pCF1AB/C12f          67952
pCF1EF/14c           67949
pCF1EF/15e           67954
pCF1AB/C25c          67958
pCF1EF/C33c          67953
pCF1EF/C33f          67050
pCF1CD/33g           67951
pCF1CD/C39c          67955
pCF1EF/C40b          67957
pCF1EF/CA167b        67959


The following strains were deposited on May 12, 1989.

Strain                    ATCC No.

Lambda gt11 (C35)         40603
Lambda gt10 (beta-5a)     40602
D1210 (C40b)              67980
D1210 (M16)               67981


The deposited materials mentioned herein are intended for convenience only, and are not required to 
practice the present invention in view of the descriptions herein, and in addition these materials are 
incorporated herein by reference. 


Industrial Applicability 


The invention, in the various manifestations disclosed herein, has many industrial uses, some of which 
are the following. The HCV cDNAs may be used for the design of probes for the detection of HCV nucleic 
acids in samples. The probes derived from the cDNAs may be used to detect HCV nucleic acids in, for 
example, chemical synthetic reactions. They may also be used in screening programs for anti-viral agents, 
to determine the effect of the agents in inhibiting viral replication in cell culture systems, and animal model 
systems. The HCV polynucleotide probes are also useful in detecting viral nucleic acids in humans, and
thus, may serve as a basis for diagnosis of HCV infections in humans.

In addition to the above, the cDNAs provided herein provide information and a means for synthesizing
polypeptides containing epitopes of HCV. These polypeptides are useful in detecting antibodies to HCV
antigens. A series of immunoassays for HCV infection, based on recombinant polypeptides containing HCV
epitopes, are described herein, and will find commercial use in diagnosing HCV induced NANBH, in
screening blood bank donors for HCV-caused infectious hepatitis, and also for detecting contaminated blood
from infectious blood donors. The viral antigens will also have utility in monitoring the efficacy of anti-viral
agents in animal model systems. In addition, the polypeptides derived from the HCV cDNAs disclosed
herein will have utility as vaccines for treatment of HCV infections.

The polypeptides derived from the HCV cDNAs, besides the above stated uses, are also useful for 
raising anti-HCV antibodies. Thus, they may be used in anti-HCV vaccines. However, the antibodies 
produced as a result of immunization with the HCV polypeptides are also useful in detecting the presence 
of viral antigens in samples. Thus, they may be used to assay the production of HCV polypeptides in 
chemical systems. The anti-HCV antibodies may also be used to monitor the efficacy of anti-viral agents in 
screening programs where these agents are tested in tissue culture systems. They may also be used for 
passive immunotherapy, and to diagnose HCV caused NANBH by allowing the detection of viral antigen(s) 
in both blood donors and recipients. Another important use for anti-HCV antibodies is in affinity chromatog- 
raphy for the purification of virus and viral polypeptides. The purified virus and viral polypeptide prepara- 
tions may be used in vaccines. However, the purified virus may also be useful for the development of cell 
culture systems in which HCV replicates. 

Antisense polynucleotides may be used as inhibitors of viral replication. 

For convenience, the anti-HCV antibodies and HCV polypeptides, whether natural or recombinant, may 
be packaged into kits. 


Claims 

1. A recombinant polynucleotide comprising a sequence derived from HCV cDNA, wherein the HCV
cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone
pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, or
wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in
Fig. 17.

2. A recombinant polynucleotide according to claim 1, encoding an epitope of HCV. 

3. A recombinant vector comprising the polynucleotide of claim 1 or claim 2. 

4. A host cell transformed with the vector of claim 3.

5. A recombinant expression system comprising an open reading frame (ORF) of DNA derived from the
recombinant polynucleotide of claim 1 or claim 2, wherein the ORF is operably linked to a control sequence
compatible with a desired host.

6. A cell transformed with the recombinant expression system of claim 5. 

7. A polypeptide produced by the cell of claim 6.

8. A purified polypeptide comprising an epitope encoded within HCV cDNA wherein the HCV cDNA is
of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17.

9. An immunogenic polypeptide produced by a cell transformed with a recombinant expression vector
comprising an ORF of DNA derived from HCV cDNA, wherein the HCV cDNA is comprised of a sequence
derived from the HCV cDNA sequence in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or
clone 33c, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or
clone 39c, or clone 15e, and wherein the ORF is operably linked to a control sequence compatible with a
desired host.

10. A peptide comprising an HCV epitope, wherein the peptide is of the formula
AAx-AAy,
wherein x and y designate amino acid numbers shown in Fig. 17, and wherein the peptide is selected from
the group consisting of AA1-AA25, AA1-AA50, AA1-AA84, AA9-AA177, AA1-AA10, AA5-AA20, AA20-AA25,
AA35-AA45, AA50-AA100, AA40-AA90, AA45-AA65, AA65-AA75, AA80-90, AA99-AA120, AA95-AA110,
AA105-AA120, AA100-AA150, AA150-AA200, AA155-AA170, AA190-AA210, AA200-AA250, AA220-AA240,
AA245-AA265, AA250-AA300, AA290-AA330, AA290-305, AA300-AA350, AA310-AA330, AA350-AA400,
AA380-AA395, AA405-AA495, AA400-AA450, AA405-AA415, AA415-AA425, AA425-AA435, AA437-AA582,
AA450-AA500, AA440-AA460, AA460-AA470, AA475-AA495, AA500-AA550, AA511-AA690, AA515-AA550,
AA550-AA600, AA550-AA625, AA575-AA605, AA585-AA600, AA600-AA650, AA600-AA625, AA635-AA665,
AA650-AA700, AA645-AA680, AA700-AA750, AA700-AA725, AA700-AA750, AA725-AA775, AA770-AA790,
AA750-AA800, AA800-AA815, AA825-AA850, AA850-AA875, AA800-AA850, AA920-AA990, AA850-AA900,
AA920-AA945, AA940-AA965, AA970-AA990, AA950-AA1000, AA1000-AA1060, AA1000-AA1025,
AA1000-AA1050, AA1025-AA1040, AA1040-AA1055, AA1075-AA1175, AA1050-AA1200, AA1070-AA1100,
AA1100-AA1130, AA1140-AA1165, AA1192-AA1457, AA1195-AA1250, AA1200-AA1225, AA1225-AA1250,
AA1250-AA1300, AA1260-AA1310, AA1260-AA1280, AA1266-AA1428, AA1300-AA1350, AA1290-AA1310,
AA1310-AA1340, AA1345-AA1405, AA1345-AA1365, AA1350-AA1400, AA1365-AA1380, AA1380-AA1405,
AA1400-AA1450, AA1450-AA1500, AA1460-AA1475, AA1475-AA1515, AA1475-AA1500, AA1500-AA1550,
AA1500-AA1515, AA1515-AA1550, AA1550-AA1600, AA1545-AA1560, AA1569-AA1931, AA1570-AA1590,
AA1595-AA1610, AA1590-AA1650, AA1610-AA1645, AA1650-AA1690, AA1685-AA1770, AA1689-AA1805,
AA1690-AA1720, AA1694-AA1735, AA1720-AA1745, AA1745-AA1770, AA1750-AA1800, AA1775-AA1810,
AA1795-AA1850, AA1850-AA1900, AA1900-AA1950, AA1900-AA1920, AA1916-AA2021, AA1920-AA1940,
AA1949-AA2124, AA1950-AA2000, AA1950-AA1985, AA1980-AA2000, AA2000-AA2050, AA2005-AA2025,
AA2020-AA2045, AA2045-AA2100, AA2045-AA2070, AA2054-AA2223, AA2070-AA2100, AA2100-AA2150,
AA2150-AA2200, AA2200-AA2250, AA2200-AA2325, AA2250-AA2330, AA2255-AA2270, AA2265-AA2280,
AA2280-AA2290, AA2287-AA2385, AA2300-AA2350, AA2290-AA2310, AA2310-AA2330, AA2330-AA2350,
AA2350-AA2400, AA2348-AA2464, AA2345-AA2415, AA2345-AA2375, AA2370-AA2410, AA2371-AA2502,
AA2400-AA2450, AA2400-AA2425, AA2415-AA2450, AA2445-AA2500, AA2445-AA2475, AA2470-AA2490,
AA2500-AA2550, AA2505-AA2540, AA2535-AA2560, AA2550-AA2600, AA2560-AA2580, AA2600-AA2650,
AA2605-AA2620, AA2620-AA2650, AA2640-AA2660, AA2650-AA2700, AA2655-AA2670, AA2670-AA2700,
AA2700-AA2750, AA2740-AA2760, AA2750-AA2800, AA2755-AA2780, AA2780-AA2830, AA2785-AA2810,
AA2796-AA2886, AA2810-AA2825, AA2800-AA2850, AA2850-AA2900, AA2850-AA2865, AA2885-AA2905,
AA2900-AA2950, AA2910-AA2930, AA2925-AA2950, AA2945-end (C' terminal).

11. A polypeptide comprised of the peptide of claim 10.

12. An immunogenic polypeptide attached to a solid substrate, wherein the polypeptide is according to
claim 7, or claim 8, or claim 9, or claim 10, or claim 11, or wherein the polypeptide is comprised of an
epitope encoded within HCV cDNA wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17.

13. A monoclonal antibody directed against an epitope encoded in HCV cDNA, wherein the HCV cDNA
is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the
sequence present in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or
clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh.

14. A preparation of purified polyclonal antibodies directed against a polypeptide comprised of an
epitope encoded within HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or
clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone
CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh.

15. A polynucleotide probe for HCV, wherein the probe is comprised of an HCV sequence derived from
an HCV cDNA sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or from
the complement of the HCV cDNA sequence.

16. A kit for analyzing samples for the presence of polynucleotides from HCV comprising a
polynucleotide probe containing a nucleotide sequence of about 8 or more nucleotides, wherein the nucleotide
sequence is derived from HCV cDNA which is of a sequence indicated by nucleotide numbers -319 to 1348
or 8659 to 8866 in Fig. 17, wherein the polynucleotide probe is in a suitable container.

17. A kit for analyzing samples for the presence of an HCV antigen comprising an antibody which
reacts immunologically with an HCV antigen, wherein the antigen contains an epitope encoded within HCV
cDNA which is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or
wherein the HCV cDNA is in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone
167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or
clone 16jh.

18. A kit for analyzing samples for the presence of an HCV antibody comprising an antigenic
polypeptide containing an HCV epitope encoded within HCV cDNA which is of a sequence indicated by
nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is in clone 13i, or clone 26j, or clone 59a, or
clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone
ag30a, or clone 205a, or clone 18g, or clone 16jh.

19. A kit for analyzing samples for the presence of an HCV antibody comprising an antigenic
polypeptide expressed from HCV cDNA in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or
clone 33c, or clone 40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or
clone 39c, or clone 15e, wherein the antigenic polypeptide is present in a suitable container.

20. A method for detecting HCV nucleic acids in a sample comprising:

(a) reacting nucleic acids of the sample with a polynucleotide probe for HCV, wherein the probe is
comprised of an HCV sequence derived from an HCV cDNA sequence indicated by
nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, and wherein the reacting is under conditions
which allow the formation of a polynucleotide duplex between the probe and the HCV nucleic acid from the
sample,

(b) detecting a polynucleotide duplex which contains the probe, formed in step (a).

21. An immunoassay for detecting an HCV antigen comprising:

(a) incubating a sample suspected of containing an HCV antigen with an antibody directed against an HCV
epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide numbers
-319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or clone 59a,
or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone
ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the incubating is under conditions which
allow formation of an antigen-antibody complex; and

(b) detecting an antibody-antigen complex formed in step (a) which contains the antibody.

22. An immunoassay for detecting antibodies directed against an HCV antigen comprising:

(a) incubating a sample suspected of containing anti-HCV antibodies with an antigen polypeptide
containing an epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by
nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or
clone 26j, or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or
clone CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the incubating is
under conditions which allow formation of an antigen-antibody complex; and

(b) detecting an antibody-antigen complex formed in step (a) which contains the antigen polypeptide.

23. An immunoassay for detecting antibodies directed against an HCV antigen comprising:

(a) incubating a sample suspected of containing anti-HCV antibodies with the polypeptide of claim 9,
under conditions which allow formation of an antigen-antibody complex; and

(b) detecting an antibody-antigen complex formed in step (a) which contains the antigen polypeptide.

24. A vaccine for treatment of HCV infection comprising an immunogenic polypeptide containing an
HCV epitope encoded in HCV cDNA, wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is the sequence present in clone 13i, or clone 26j, or
clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone
CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, and wherein the immunogenic
polypeptide is present in a pharmacologically effective dose in a pharmaceutically acceptable excipient.

25. A method for producing antibodies to HCV comprising administering to an individual an isolated
immunogenic polypeptide containing an HCV epitope encoded in HCV cDNA, wherein the HCV cDNA is of
a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, or is of the sequence
present in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or clone 33c, or clone 40b, or
clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or clone 39c, or clone 15e,
and wherein the immunogenic polypeptide is present in a pharmacologically effective dose in a
pharmaceutically acceptable excipient.

26. An antisense polynucleotide derived from HCV cDNA, wherein the HCV cDNA is that shown in
Fig. 17.

27. A method for preparing purified fusion polypeptide C100-3 comprising:

(a) providing a crude cell lysate containing polypeptide C100-3,

(b) treating the crude cell lysate with an amount of acetone which causes the polypeptide to
precipitate,

(c) isolating and solubilizing the precipitated material,

(d) isolating the C100-3 polypeptide by anion exchange chromatography, and

(e) further isolating the C100-3 polypeptide of step (d) by gel filtration.

28. A method for preparing an HCV polypeptide comprising:

(a) providing a host cell transformed with a recombinant expression system comprising an open
reading frame (ORF) of DNA derived from HCV cDNA, wherein the HCV cDNA is in clone 13i, or clone 26j,
or clone 59a, or clone 84a, or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone
CA290a, or clone ag30a, or clone 205a, or clone 18g, or clone 16jh, or wherein the HCV cDNA is of a
sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in Fig. 17, wherein the ORF is
operably linked to a control sequence compatible with a desired host; and

(b) incubating the host cell under conditions which allow expression of the HCV polypeptide.

29. A method for preparing an immunogenic HCV polypeptide comprising:

(a) providing a host cell transformed with a recombinant expression vector comprising an ORF of
DNA derived from HCV cDNA, wherein the HCV cDNA is comprised of a sequence derived from the HCV
cDNA sequence in clone CA279a, or clone CA74a, or clone 13i, or clone CA290a, or clone 33c, or clone
40b, or clone 33b, or clone 25c, or clone 14c, or clone 8f, or clone 33f, or clone 33g, or clone 39c, or clone
15e, wherein the ORF is operably linked to a control sequence compatible with the desired host; and

(b) incubating the host cell under conditions which allow expression of the HCV polypeptide.

30. A method for preparing a host cell transformed with a recombinant polynucleotide comprising a
sequence of HCV cDNA derived from the HCV cDNA in clone 13i, or clone 26j, or clone 59a, or clone 84a,
or clone CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or
clone 205a, or clone 18g, or clone 16jh, or wherein the HCV cDNA is of a sequence indicated by nucleotide
numbers -319 to 1348 or 8659 to 8866 in Fig. 17 comprising:

(a) providing a host cell capable of transformation;

(b) providing the recombinant polynucleotide; and

(c) incubating (a) with (b) under conditions which allow transformation of the host cell with the
polynucleotide.

31. A method for preparing a recombinant polynucleotide comprised of a sequence of HCV cDNA
derived from the HCV cDNA in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone CA156e, or
clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a, or clone 18g,
or clone 16jh, or wherein the HCV cDNA is of a sequence indicated by nucleotide numbers -319 to 1348 or
8659 to 8866 in Fig. 17 comprising:

(a) providing a host cell transformed with the recombinant polynucleotide; and

(b) isolating said polynucleotide from said host cell.

32. A method for preparing blood free of HCV comprising:

(a) providing a sample of blood suspected of containing HCV and anti-HCV antibodies;

(b) providing an immunogenic polypeptide prepared according to claim 28 or 29;

(c) incubating the sample of (a) with the immunogenic polypeptide of (b) under conditions which allow
the formation of antibody-HCV polypeptide complexes;

(d) detecting the complexes formed in step (c); and

(e) saving the blood from which complexes were not detected in (d).

33. A method for preparing blood free of HCV comprising:

(a) providing nucleic acids from a sample of blood suspected of containing HCV polynucleotides;

(b) providing a probe for HCV, wherein the probe is comprised of an HCV sequence derived from an
HCV cDNA which is of a sequence indicated by nucleotide numbers -319 to 1348 or 8659 to 8866 in
Fig. 17,

(c) reacting (a) with (b) under conditions which allow the formation of a polynucleotide duplex
between the probe and the HCV nucleic acid from the sample;

(d) detecting a polynucleotide which contains the probe, formed in step (c); and

(e) saving the blood from which complexes were not detected in (d).

34. A method for producing a hybridoma which produces anti-HCV monoclonal antibodies comprising:

(a) immunizing an individual with an immunogenic polypeptide containing an epitope encoded in HCV
cDNA, wherein the HCV cDNA is HCV cDNA in clone 13i, or clone 26j, or clone 59a, or clone 84a, or clone
CA156e, or clone 167b, or clone pi14a, or clone CA216a, or clone CA290a, or clone ag30a, or clone 205a,
or clone 18g, or clone 16jh, or wherein the HCV cDNA is of a sequence indicated by nucleotide numbers
-319 to 1348 or 8659 to 8866 in Fig. 17; or

(b) immunizing an individual with an immunogenic polypeptide prepared according to claim 29;

(c) immortalizing antibody producing cells from the immunized individual;

(d) selecting an immortal cell which produces antibodies which react with an HCV epitope in the
immunogenic polypeptide of (a) or (b); and

(e) growing said immortal cell.

Translation of DNA 12f


IlePheLysIleArgMetTyrValGlyGlyValGluHisArgLeuGluAlaAlaCysAsn
1 CCATATTTAAAATCAGGATGTACGTGGGAGGGGTCGAACACAGGCTGGAAGCTGCCTGCA
GGTATAAATTTTAGTCCTACATGCACCCTCCCCAGCTTGTGTCCGACCTTCGACGGACGT


TrpThrArgGlyGluArgCysAspLeuGluAspArgAspArgSerGluLeuSerProLeu
61 ACTGGACGCGGGGCGAACGTTGCGATCTGGAAGACAGGGACAGGTCCGAGCTCAGCCCGT
TGACCTGCGCCCCGCTTGCAACGCTAGACCTTCTGTCCCTGTCCAGGCTCGAGTCGGGCA


LeuLeuThrThrThrGlnTrpGlnValLeuProCysSerPheThrThrLeuProAlaLeu
121 TACTGCTGACCACTACACAGTGGCAGGTCCTCCCGTGTTCCTTCACAACCCTACCAGCCT
ATGACGACTGGTGATGTGTCACCGTCCAGGAGGGCACAAGGAAGTGTTGGGATGGTCGGA


SerThrGlyLeuIleHisLeuHisGlnAsnIleValAspValGlnTyrLeuTyrGlyVal
181 TGTCCACCGGCCTCATCCACCTCCACCAGAACATTGTGGACGTGCAGTACTTGTACGGGG
ACAGGTGGCCGGAGTAGGTGGAGGTGGTCTTGTAACACCTGCACGTCATGAACATGCCCC


GlySerSerIleAlaSerTrpAlaIleLysTrpGluTyrValValLeuLeuPheLeuLeu
241 TGGGGTCAAGCATCGCGTCCTGGGCCATTAAGTGGGAGTACGTCGTTCTCCTGTTCCTTC
ACCCCAGTTCGTAGCGCAGGACCCGGTAATTCACCCTCATGCAGCAAGAGGACAAGGAAG


LeuAlaAspAlaArgValCysSerCysLeuTrpMetMetLeuLeuIleSerGlnAlaGlu
301 TGCTTGCAGACGCGCGCGTCTGCTCCTGCTTGTGGATGATGCTACTCATATCCCAAGCGG
ACGAACGTCTGCGCGCGCAGACGAGGACGAACACCTACTACGATGAGTATAGGGTTCGCC

Overlap with 14i

AlaAlaLeuGluAsnLeuValIleLeuAsnAlaAlaSerLeuAlaGlyThrHisGlyLeu
361 AGGCGGCTTTGGAGAACCTCGTAATACTTAATGCAGCATCCCTGGCCGGGACGCACGGTC
TCCGCCGAAACCTCTTGGAGCATTATGAATTACGTCGTAGGGACCGGCCCTGCGTGCCAG


Val
421 TTGTATC
AACATAG


FIGURE 1
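
The three-line layout used in the sequence figures (translation, coding strand, complementary strand) can be cross-checked mechanically. The following is a minimal Python sketch, not part of the original disclosure. It assumes the standard genetic code, assumes the lower strand is printed base-for-base beneath the upper strand (so no reversal is needed), and takes the first 60 nucleotides of DNA 12f as read from the printed lower strand. The names `complement`, `translate`, and `LOWER_STRAND` are illustrative only.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr End End Cys Cys End Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Base-by-base complement, in the same left-to-right order as printed."""
    return strand.translate(COMPLEMENT)


def translate(strand: str, offset: int = 0) -> str:
    """Translate complete codons starting at `offset`; a trailing partial codon is dropped."""
    codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Lower (complementary) strand of row 1 of the DNA 12f figure.
LOWER_STRAND = "GGTATAAATTTTAGTCCTACATGCACCCTCCCCAGCTTGTGTCCGACCTTCGACGGACGT"
UPPER_STRAND = complement(LOWER_STRAND)

print(UPPER_STRAND)
# The reading frame of this row starts at the third nucleotide (offset 2).
print(translate(UPPER_STRAND, offset=2))
```

Because the two strands and the translation are redundant, a disagreement between any two of them points to a transcription error in one line of the row.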


EP 0 388 232 A1 



Translation of DNA K9-1

- °^ZS^H52S^Ii Ar5LeuAl4SerC ’> r,Ar 9 Prot * euThrAs P ph e A *PGlnGlyTrpGly 

CAGGCTGTCCTGAGAGCCTAGCCAGCTGCCGACCCCTTACCGArTTTGACCAGGGCTGGG 

G7CCGACAGGACTCTCCCATCGGTCGACGGCTGGGCAATGGCTAAAACTGGTCCCGACCC 

ProlleSerTyrAlaAanOiySerGlyProAapGlnArgProTyrCysTrpHlsTyrPro 

gccctatcagttatgccaacggaagcggccccgaccagcgcccctactgctggcactacc 

CGGGATAGTCAATACGGTTGCCTTCGCCGCGGCTGGTCGCGGGGATGACGACCGTGATGG 

ProLysProCysGlyllaValProAlalysSerValCysGlyProValTyrCysPheThr 

CCCCAAAACCTTGCGGTATTGTGCCCGCGAAGAGTGTCTGTGGTCCGGTATATTGCTTCA 

GGGGTTTTGGAACGCCATAACACGGGCGCTTCTCACACACACCAGGCCATATAACGAAGT 

ProSerProValValValGlyThrThrAspArgSorGlyAlaProThrTyrSarTrpGly 

CTCCCAGCCCCGTGGTGGTGGGAACGACCGACAGGTCGGGCGCGCCCACCTACAGCTGGG 

GAGGGTCGGCGCACCACCACCCTTGCTGGCTGTCCAGCCCGCGCGGGTGGATGTCGACCC 

GluAsnAjpThrAspValPheValLauAsnAsnThrArgProProLeuGlyAsnTrpPhe 

GTGAAAATGATACGGACGTCTTCGTCCTTAACAATACCAGGCCACCGCTGGGCAATTGGT 

CACTTTTACTATGCCTGCAGAAGCAGGAATTGTTATGGTCCGGTGGCGACCCGTTAACCA 

GlyCysThrTrpMetAsnSerThrGlyPheThrl.ysValCyiGlyAlAProProCy*vaf 

TCGGTTGTACCTGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCTCCTTGTG 

AGCCAACATGGACCTACTTGAGTTGACCTAAGTGGTTTCACACGCCTCGCGGAGGAACAC 

IleGlyGlyAlaGlyAsnAsnThrLeuHlsCysProThrAspCysPheArgLysHlsPro 

TCATCGGAGGGGCGGGCAACAACACCCTGCACTGCCCCACTGATTGCrrCCGCAAGCATC 

AGTAGCCTCCCCGCCCGTTGTTGTGGGACGTGACGGGGTGACTAACGAAGGCGTTCGTAG 

A fP A ^ T l} rryrSerAr '7 c ysGlySe r Gly p roTrpIl«ThrProArgCysLeuValAsp 

CGGACGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGATCACACCCAGGTGCCTGGTCG 

GCCTGCGGTGTATGAGAGCCACGCCGAGGCCAGGGACCTAGTGTGGGTCCACGGACCAGC 


TyrProTyrArgLeuTrpHisTyrProCysThrIl«AsnTyrThrIlaPheI,ysIl«Arg 
481 ACTACCCGTATAGGCTTTGGCATTATCCTTGTACCATCAACTACACTATATTTAAAATCA 

- . tgatgggcatatccgaaaccgtaataggaacatggtagttgatgtgatataaattttagt 


MatTyTValGlyGlyValGluHisArgLauGluAlaAlaCysAsnTrpThrArgGlyGlu 
541 GGATGTACGTGGGAGGGGTCGAGCACAGGCTGGAAGCTGCCTGCAACTGGACGCGGGGCG 
CCTACATGCACCCTCCCCAGCTCGTGTCCGACCTTCGACGGACGTTGACCTGCGCCCCGC 




ArgCysAjpLeuGluAspArgAspArgSerCluLeuSerProLeuLeuLeuThrThrThr 
601 AACGTTGCGATCTGGAAGATAGGGACAGGTCCGAGCTCAGCCCGTTACTGCTGACCACTA 
TTGCAACGCTAGACCTTCTATCCCTGTCCAGGCTCGAGTCGGGCAATGACGACTGGTGAT 


GlnTrpGlnValLauProCysSarPhaThrThrLeuProAlaLauSerThxClyLeuIle 
661 CACAGTGGCAGGTCCTCCCGTGTTCCTTCACAACCCTGCCAGCCTTGTCCACCGGCCTCA 
GTGTCACCGTCCAGGAGGGCACAAGGAAGTGTTGGGACGGTCGGAACAGGTGGCCGGAGT 

Overlap with Combined ORF of DNAs 12f through 15e

HlsLeuHlsGlnAsnlleValAspValGlnTyrLeuTyzGlyValGlySerSerZleAla 
721 TCCACCTCCACCAGAACATTGTGGACGTGCAGTACTTGTACGGGGTGGGGTCAAGCATCG 
AGGTGGAGGTGGTCTTGTAACACCTGCACGTCATGAACATGCCCCACCCCAGTTCGTAGC 


SerTrpAlalleLyiTrpGluTyrValYalLeuLeuPheLeuLeuteuAlaAapAlaArg 
761 CGTCCTGGGCCATTAAGTGGGAGTACGTCGTCCTCCTGTTCCTTCTGCTTGCAGACC GGC 
GCAGGACCCGGTAATTCACCCTCATGCAGCAGGAGGACAAGGAAGACGAACGTCTGCGCG 


FIGURE 2-1 
ValCysSerCysLeuTrpMetMetLeuLeuIleSerGlnAlaGluAlaAlaLeuGluAsn
841 GCGTCTGCTCCTGCTTGTGGATGATGCTACTCATATCCCAAGCGGAAGCGGCTTTGGAGA
CGCAGACGAGGACGAACACCTACTACGATGAGTATAGGGTTCGCCTTCGCCGAAACCTCT


901 


LauVallleLauAjRAlaAlaSaxLauAlaGlyThrHisGlyL«uValS«rPheL*uVal 

ACCTCGTAATACTTAATGCAGCATCCCTGCCCGGCACGCACGGTCTTGTATOCTTCCTCG 

TGGAGCAtTATGAATTACGTCGTAGGGACCGGCCCTGCGTGCCAGAACATAGGAAGGAGC 


PhePheCyiPhaAlaTrpTyrLauLysGlyLyaTrpValProGlyAlaValTyrxKrPh* 
961 TGTTCTTCTGCTTTGCATGGTATCTGAAGGGTAAGTGGGTGCCCGGAGCG6TCTACACCT 
ACAAG AAG ACGAAACGTAC CAT AGACTTCC CATTCACCCACGGGCC TCGCCAGATGTC6A 

^ * 

Ty rC 1 yMetTrp ProI*uLeuI*ul*uL%uI*uAliI*uProG lnArgAitTyrAlaLeu 
1021 TCTACGGGATGTGGCCTCTCCTCCTGCTCCTGTTGGCGTTGCCCCAGGGGGCGTACGCGC 
AGATGCCCTACACCGGAGAGGAGGACGAGGACAACCGCAACGGGGTCGCCCGCATGCGCG 


AspThrGluValAlaAlaSerCysGlyGlyValVaU.euValGlyLeuMetAlaL«uThr 
1031 TGGACACGGAGGTGGCCGCGTCGTGTGGCGGTGTTGTTCTCGTCGGGTTGATGGCGCTAA 
ACCTGTGCCTCCACCGGCGCAGCACACCGCCACAACAAGAGCAGCCCAACTACCGCGATT 


LauSarProTyrTyrXysArgTyrIlaSarTrpCytlauTrpTrpiauGlnTyrPhal.au 
114 1 CTCTGTCACCATATTACAAGCGCTATATCAGCTGGTGCTTGTGGTGGCTTCAGTATTTTC 
GAGACAGTGGIATAATGTTCGCGATATAGTCGACCACGAACACCACCGAAGICAXAAAAG 


ThrArgValGluAlaGlnLeuHlsValTrptlaProProLeuAsnValArgGlyGlyArg 
120 1 TGACCAGAGTGGAAGCGCAACTGCACGTGTGGATTCCCCCCCTCAACGTCCGAGGGGGGC 
actggtctcaccttcgcgttgacgtgcacacctaagggggggagttgcaggctccccccg 


. . .AspAlaValXleLauLauKetCysAlaValKisProThrLauValPheAapXlaThrCys 
1261 GCG ACGCTGTCATCTTACTCATGTGTGCTGTACACCCG ACTCTGGTAtTTGACATCACCA 
CGCTGCGACAGTAGAATGAGTACACACGACATGTGGGCXGAGACCATAAACTGTAGTGGT 


LeuLauLe uAl a Va 1 P heG 1 y P roLeuTrp 1 1 eLauG InAla 
1321 AATTGCTGCTGGCCGTCTTCGGACCCCTTTGGATT C rrCAAGCCAG 
TTAACGACGACCGGCAGAAGCCTGGGGAAACCTAAGAAGTTCGGTC 


FIGURE 2-2 
Translation of DNA 15e

[Sequence and translation are not legible in the source scan.]

FIGURE 3
Translation of DNA 13i

[Only the following complementary-strand rows are legible in the source scan.]

AGTAGCCTCCCCGCCCGTTGTTGTGGGACGTGACGGGGTGACTAACGAAGGCGTTCGTAG

Overlap with 12f

GTGTCACCGTCCAGGAGGGCACAAGGAAGTGTTGGGACGGTCGGAACAGGTGGCCGGAGT


FIGURE 4
Translation of DNA 26j

[Only the following complementary-strand rows are legible in the source scan.]

GCTGGTCGCGGGGATGACGACCGTGATGGGGGGTTTTCGAACGCCATAACACGGGCGCTT

CTCACACACACCAGGCCATATAACGAAGTGAGGGTCGGGGCACCACCACCC


FIGURE 5
Translation of DNA CA59a


LeuValMetAlaGlnLeuLeuArgIleProGlnAlaIleLeuAspMetIleAlaGlyAla
1 TTGGTAATGGCTCAGCTGCTCCGGATCCCACAAGCCATCTTGGACATGATCGCTGGTGCT
AACCATTACCGAGTCGACGAGGCCTAGGGTGTTCGGTAGAACCTGTACTAGCGACCACGA

HisTrpGlyValLeuAlaGlyIleAlaTyrPheSerMetValGlyAsnTrpAlaLysVal
61 CACTGGGGAGTCCTGGCGGGCATAGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTC
GTGACCCCTCAGGACCGCCCGTATCGCATAAAGAGGTACCACCCCTTGACCCGCTTCCAG

LeuValValLeuLeuLeuPheAlaGlyValAspAlaGluThrHisValThrGlyGlySer
121 CTGGTAGTGCTGCTGCTATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGT
GACCATCACGACGACGATAAACGGCCGCAGCTGCGCCTTTGGGTGCAGTGGCCCCCTTCA

AlaGlyHisThrValSerGlyPheValSerLeuLeuAlaProGlyAlaLysGlnAsnVal
181 GCCGGCCACACTGTGTCTGGATTTGTTAGCCTCCTCGCACCAGGCGCCAAGCAGAACGTC
CGGCCGGTGTGACACAGACCTAAACAATCGGAGGAGCGTGGTCCGCGGTTCGTCTTGCAG

GlnLeuIleAsnThrAsnGlySerTrpHisLeuAsnSerThrAlaLeuAsnCysAsnAsp
241 CAGCTGATCAACACCAACGGCAGTTGGCACCTCAATAGCACGGCCCTGAACTGCAATGAT
GTCGACTAGTTGTGGTTGCCGTCAACCGTGGAGTTATCGTGCCGGGACTTGACGTTACTA

SerLeuAsnThrGlyTrpLeuAlaGlyLeuPheTyrHisHisLysPheAsnSerSerGly
301 AGCCTCAACACCGGCTGGTTGGCAGGGCTTTTCTATCACCACAAGTTCAACTCTTCAGGC
TCGGAGTTGTGGCCGACCAACCGTCCCGAAAAGATAGTGGTGTTCAAGTTGAGAAGTCCG

Overlap with 26j

Overlap with K9-1

CysProGluArgLeuAlaSerCysArgPro
361 TGTCCTGAGAGGCTAGCCAGCTGCCGACCCC
ACAGGACTCTCCGATCGGTCGACGGCTGGGG


FIGURE 6
Translation of DNA CA84a

[Preceding rows are not legible in the source scan.]

Overlap with CA59a

AspAlaGluThrHisValThrGly
241 TCGACGCGGAAACCCACGTCACCGGGG
AGCTGCGCCTTTGGGTGCAGTGGCCCC


FIGURE 7





Translation of DNA CA156e

[Only the following complementary-strand rows and the final row are legible in the source scan.]

CACAACCCACCGCTACTGGGGATGCCACCGGTGGTCCCTACCGTTTGAGGGGCGCTGCGT

CCCCCTGGATACGCCCAGACAGAAAGAACAGCCGGTTGACAAGTGGAAGAGAGCGTCCGC

GGTGACCTGCTGCGTTCCAACGTTAACGAGATAGATAGGGCCGGTATATTGCCCAGTGGC

Overlap with CA84a

LeuArgIleProGlnAla
301 GCTCCGGATCCCACAAGCC
CGAGGCCTAGGGTGTTCGG


FIGURE 8
Translation of DNA CA167b

[Only the following complementary-strand rows and the final row are legible in the source scan.]

CCGGCTACGGTAGGACGTGTGAGGCCCCACGCAGGGAACGCAAGCACTCCCGTTGCGGAG

Overlap with [clone name illegible]

CGTCGAAGCTGCAGTGTAGCTAGACGAACAGCCCTCGCGATGGGAGACAAGCCGGGAGAT

ValGlyAspLeuCysGlySerValPheLeu
241 CGTGGGGGACTTGTGCGGGTCTGTCTTTCTTG
GCACCCCCTGAACACGCCCAGACAGAAAGAAC


FIGURE 9





Translation of DNA CA216a

[Nucleotide rows are not legible in the source scan; only the following translation rows could be read.]

GluAlaAlaAspAlaIleLeuHisThrProGlyCysValProCysValArgGlu

Overlap with [clone name illegible]

GlyAsnAlaSerArgCysTrpValAlaMetThrProThrValAla


FIGURE 10
Translation of DNA CA290a

[Nucleotide rows are not legible in the source scan; only the following translation row could be read.]

Overlap with CA216a

ArgAlaLeuAlaHisGlyValArgValLeuGluAspGlyValAsnTyrAlaThrGlyAsn


FIGURE 11
Translation of DNA ag30a


IMctSerValValClsProPraClyProProLau 

MecAlaLeuYalOP 

1 aOCAGXAACCCTCTACCCATCCCCTTACTATGAGTGTCGTGCAGCCTCCAOGACCCCCCC 

ocgtcmtcgcwatcggiaccocaatcatactcacagcaostcggagg tcctggggggg 


proGlyGluProAH 


6 1 TCCCCCCACACCCATACTCCTCTGCGG AACCCCTCACTACAC0GGAA7TGCCAGGACGAC 

agggccctctoggtatcaccagacgccttggccactcatgtggccttaacggtcctgctg 


fKetProGlyAspLeuGlyvalProrroGlnAAp 
131 CGGGTCCTTTCTTGGATCAACCCGCTCAATGCCTOGAG AT7TGGGCGTGGCCCCGCAAG A 

gcccaggaaagaacctagttgggcgagttacggacctciaaacccgcacgggggcgttct 


OP AN GlrAlaCra 

CyrAM ■ 

181 CTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGCCTrOTGGTACTGCCTGATAGGGTCCTT 
GACGATCGGCTCATCACAACCCAGCGCTTTCCGGAAC&CCATCACCCACTATCCCACCAA 

I 

GluCysProGlyArgSerArgArgProCysThrMetSerThrAsnProLysProGlnLys


241 GCGAGTGCCCCGCGAGGTCTCGTAGACCGTGCACCATGAGCACGAATCCTAAACCTCAAA 
CGCTCXCGGGGCCCTCCAGAGCATCTCCCAOCTGGTACTCGTGCTTAGCATTTCCACTTT 


LysAsnLysArgAsnThrAsnArgArgProGlnAspValLysPheProGlyGlyGlyGln


301 aaaaaaacaaacgtaag\ccaaccgtcgcccacaggacgtcaagttcccgggiggcggtc 
tttttttctttccattgtgcttggcaccgcctgtcctccacttcaaccgcccacccccag 


IleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeuGlyValArgAlaThr


361 agatcgtiggtggagtttacttgttgccgcgcaggggccctagattgggtgtgcgcgcga 

TCTACCAA C CAC C TCAAA1CAACAACGCCCCCTCCCCCGCATCTAACCCACAC6CCCGCT 


ArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnProIleProLysAlaArg


4 21 CGAGAAAG ACTTCCGAGCGGTCGCAACCTCGAGG TAG ACGTCAGCCT ATCCCCAAGGCTC 
CCTCTTTCTCAAGGCTCGCCACCGTTGCACCTCCATCTGCACTCCCAIACOCCrTCCCAC 


ArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpProLeuTyrGlyAsnGlu


481 


>— Overlap vlth 

GTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATG 

■1AGCCGGCCTCCCGTCCTGCACCCCAGTCGGGCCCATCCGAACCGGCCACATACCCTTAC 




FIGURE 12 -1 
GlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArgProSerTrpGlyPro


541 


raSaSWSCCClCCCCCCCTACCCACCACAGMGGGCACCGASXGCCCGATCOACCCCGO 


ThrAspProArgArgArgSerArgAsnLeuGlyLysValIleAspThrLeuThrCysGly


601 CCACAG ACCCCCCGCGTAGGTCGCGCAATTXCGGTAAGCTCATCCATACCCTtACCTCCG 

gctgtcxcccgcccgcatccagcgcgttaaacccattccagtagctatcggaatgcacgc 


Phe 


$61 GCTTC 

ccaag 


• • Start of long HCV ORF 

| • Putative first amino acid of large HCV polyprotein
i - Putative small encoded peptides (that may play a
translational regulatory role)


► 


FIGURE 12-2 
Translation of DNA CA205a

vaiZMaSXrArvGluArTfroCrafilynirAlaOP AN <21yJUaCyaCluCysPrefily 
1 CTCTTCGCtOSOSAaAflOCCIPr QWtf r A CtCCeWATAGOQT Q CTTgCOAflTOCCOCOGO 


61 JU»TC7C0TA«CC0TCCA0CATGASCACSAATCCTMACCTCAAASAAAAACCAAACGT 


121 


AsnThrAsnArgArgProGlnAspValLysPheProGlyGlyGlyGlnIleValGlyGly

AACACCAACCGTCGCCCACAGGACGTCAAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGA

TTGTGGTTGGCAGCGGGTGTCCTGCAGTTCAAGGGCCCACCGCCAGTCTAGCAACCACCT


valTrrbeulcuPr 
181 GTTTACTTGTTGCC 



241 



Ar uCu.' T rc AlaglnrroglyTyrProTrpProLauTyrGlyAJnGluClyCyi 
301 juKACCTCCarrCACCaWOCTACCCTTCCCCCCTCTAtCCCAAXCJlMeeTCCC 
rCCTOOACCOSAfltOOOOCCCAiqOOAACCOOOQAaATACCCTtACTCC C CA fWC 


• - putative initiator methionine codon






Translation of DNA 18g


*ProProOP 

±??5F l ^^3£ s !: Hi,5erProVaiAr9AsnT y rC y sI ‘ e ' iMisA i aClu s«v»i*x 

*^^^ a ^ ,a ^ u ^ er ^ euPr °C y»QluGIui.ff»'LeuS«r3egAgqAraI.T«AygT ^ | 

^y^^^^^^CGA^CTAC^TCTTCACCCJU^UUCgCTrTi^rr 
CAG^TCCTACJTACXCAGQCGACACtCCtTOAICJ^AflAACTG O ar e rwii 


€1 


m «tS*rV alV alClnPrcProfllyProProLauProGlyCluProAM 
MetAlaLeuVdlCP 

ATCXSCGTTAGTATGAGTCTCGTGCAKCTCCAGGACCCCCCCTCCCGGGAGAGCCATAGT 

TACCGC?ATCATACrcACAGCACGrCGGAGGTCCTGGGGGGGASG3CCCTCTCGCTATai 


121 GGTCrGCGCAACCGGTGAGTACACCGGAArTGCCAGGACCACCGGGTCCTTTCTTGGAXC 
CCAGACGCCrTGGCCACTCATGTCGCCXrAACGGICCTGCTGGCCCAGGAAAGAACCIAG 


Overlap with ag30a

♦MetP rcGlyA*pLeuGlyYa IPrcProG LnAs pCysAX 

181 AACCCGCTCAATGCCTGGAGATTTGGGCGTGCCCCCSCAAG ACIGCTAGCCGAGTAGTCT 
TTCCCCSAfirr ACCCACCTCTAAACCCCCACCCCCCCG TTCTCACG AXCCCC XCATCACA 


OP AH GlyAlaCyaGluCyaProGlyArgSor 

IT 


24 X TGGGTCGCGAAAGGCCTTGTGGTACTCCCTGAIAGGGTGCTTGCGAGTGCCCCGGGAGGT 
ACCCAGCaCTTTCCGGAACACCATGACGGACrAKCCACGAACGCTCACeCGCCCCTCCA 


ArgArg 

201 CTCGTAGA 
CACCATCT 


* - Start of long HCV ORF

% - Putative small encoded peptides (that may play
a translational regulatory role)


FIGURE 14 
Translation of DNA 16jh
Overlap with 15e

GlyAlaCysTyrSerIleGluProLeuAspLeuProProIleIleGlnArgLeuHisGly
1 GGGGCCTGCTACTCCATAGAACCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGC
CCCCGGACGATGAGGTATCTTGGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCG

LeuSerAlaPheSerLeuHisSerTyrSerProGlyGluIleAsnArgValAlaAlaCys
61 CTCAGCGCATTTTCACTCCACAGTTACTCTCCAGGTGAAATTAATAGGGTGGCCGCATGC
GAGTCGCGTAAAAGTGAGGTGTCAATGAGAGGTCCACTTTAATTATCCCACCGGCGTACG

Gly*

G

LeuArgLysLeuGlyValProProLeuArgAlaTrpArgHisArgAlaArgSerValArg
121 CTCAGAAAACTTGGGGTACCGCCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGC
GAGTCTTTTGAACCCCATGGCGGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCG

AlaArgLeuLeuAlaArgGlyGlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrp
181 GCTAGGCTTCTGGCCAGAGGAGGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGG
CGATCCGAAGACCGGTCTCCTCCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACC

AlaValArgThrLysLeuLys
241 GCAGTAAGAACAAAGCTCAAAC
CGTCATTCTTGTTTCGAGTTTG


* - nucleotide heterogeneity


FIGURE 15
COMBINED ORF OF DNAs pi14a THROUGH 15e

(pi14a/CA167b/CA156e/CA84a/CA59a/K9-1/12f/14i/11b/7f/7e/
8h/33c/40b/37b/35/36/81/32/33b/25c/14c/8f/33f/33g/39c/
35f/19g/26g & 15e)


TCCAGCGCGTTAAACCCATTCCAGTAGCTATGGGAATGCACGCCGAAGCGGCTGGAGTAC 
«l OoScATiCCGMCCK^CCCCCTOTK^CKTOCMMCCCIMCCaTOC 

cccAwi»iMewc»cKCiKoi^»iMccrascSc«wccisSccseoiiccs 

aawccAMMciKKceGCJuraM^^ 

181 

301 

isi ^^^KaaBssaBaasasBaasias 

. CGGAGCTCCACAACCCACCGCTACTGGGCATGCCACCG6TGGTCCCTACCGTTTGAGGGG 
CGCTGCGTCGAAGCTGCAGTGTAGCTAGACGAACAGCCCTCGCGGTGGGAGACAAGCCGG 

* 9 1 J5r5r5Xwi^?2^ uCysGlyS ® rV * lpfleteuV * lsl y Gltl£ '* u? : heThrPheSer 

4 81 CTCT^G^WGACCTATGCGGGTCTGTCTTTCTTGTCGGCCAACTGTTCACCTTCTCT 
GAGATGCACCCCCTGGATACGCCCAGACAGAAAGAACAGCCGGTTGACAAGTGGAAGAGA 

541 rrriftrwrf^2T55?^J n 5 lyCysAspCysSerIleT y rProG1 y HiaIle ‘ rhr 
5 ’ 1 CCC AGG CGCCACTGGACGACGCAAGGTTGCAA.TTGCTCTATCTATCCCGGgrAT&TAarr: 

GGGTCCGCGGTGACCTGCTGCGTTCCAACGTTAACGAGATAGATAGGGCCGGTATATTGC 

am rr^*^^2^25^i?^ pA3pMetMetMetAsnTr P SerProTJirrhrAlaL « uVai - vlct 

601 rrs^^^TSSS^?5^ ATATGATGATGAACTGGTCCCCTACGACKCGTT GGTAATG 

CCAGTGGCGTACCGTACCCTATACTACTACTTGACCAGGGGATGCTGCCGCAACCATTAC 


6«1 ^^^^^^CCGGATCCCACAAGCCATCTTGGACATGATCGCTGGTGCTCACTGGGGA ^ 
CGAGTCGACGAGGCCTAGGGTGTTCGGTAGAACCTGTACTAGCGACCACGAGTGACCCCT 

i ■» t 1 aG ^y IleAlaTyrPheSerMetValGlyAanTrpAlaLyjValLauValVal 

721 GTCCTGGCGGGCATAGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTG 
CAGGACCGCCCGTATCGCATAAAGAGGTACCACCCCTTGACCCGCTTCCAGGACCATCAC 

, I^BuLauLeuPheAlaGlyValAapAlaGluThrHlsValThiClyGlySerAlaGlyHls 
781 CTGCTGCTATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGTCCCGGCCAC 
GACGACGATAAACGGCCGCAGCTGCGCCXTTGGGTCCAGTGGCCCCCTTCACGGCCGGTG 
ThrValSerGlyPheValSerLeuLeuAlaProGlyAlaLysGlnAsnValGlnLeuIle
841 ACTGTGTCTGGATTTGTTAGCCTCCTCGCACCAGGCGCCAAGCAGAACGTCCAGCTGATC
TGACACAGACCTAAACAATCGGAGGAGCGTGGTCCGCGGTTCGTCTTGCAGGTCGACTAG

AjnXhrAjnGlySarXrpHisLauAanSerThrAlaLauAjnCysAfnAspSarLauAsn 
' 901 AACACCAACGGCAGTTGGCACCTCAATAGCACGGCCCIGAACTGCAATGATAGCCICAAC 
XTGXGGXXCCCGXCAACCGIGGAGTIAXCGIGCCGGGACITGACGTXACIAICGGAGTTG 

XhrGlyXrpLeuAlaGlyLauPheXyrHisHiJLyaPheAsnSerSarGlyCysProGlu 
961 ACCGGCTGGTTGGCAGGGCTTTTCTATCACCACAAGTTCAACTCTTCAGGCTGTCCTGAG 
TGGCCGACCAACCGTCCCGAAAAGAIAGXGGTGXTCAAGTXGAGAAGTCCGACACGACTC 

ArgLeuAlaSarCy3ArgProZ,euXhrAspPheAspGlnClyTrpGlyProIleSerIyr 
1021 AGGCTAGCCAGCTGCCGACCCCTTACCGATTTTGACCAGGGCT'GGGGCCCTATCAGTTAT 
TCCGATCGGTCGACGGCTGGGGAATGGCTAAAACTGGTCCCGACCCCGGGATAGTCAATA 

AlaAsnGlySerGlyProAspClnArgProXyrCysXrpHlsXyrProProLyaProCys 
1081 GCCAACGGAAGCGGCCCCGACCAGCGCCCCTACTGCTGGCACTACCCCCCAAAACCTTGC 
CGGTTGCCTTCGCCGGGGCTGGTCGCGGGGATGACGACCGTGAXGGGGGGTTTTGGAACG 

GlyIleValProAlaLysSerValCysGlyProValTyrCysPheThrProSerProVal
1141 GGTATTGTGCCCGCGAAGAGTGTGTGTGGTCCGGTATATTGCTTCACTCCCAGCCCCGTG
CCATAACACGGGCGCTTCTCACACACACCAGGCCATATAACGAAGTGAGGGTCGGGGCAC

ValValGlyThrThrAspArgSerGlyAlaProThrTyrSerTrpGlyGluAsnAspThr
1201 GTGGTGGGAACGACCGACAGGTCGGGCGCGCCCACCTACAGCTGGGGTGAAAATGATACG
CACCACCCTTGCTGGCTGTCCAGCCCGCGCGGGTGGATGTCGACCCCACTTTTACTATGC

• * 

AspValPheValLeuAsnAsnThrArgProProLeuGlyAsnTrpPheGlyCysThrTrp
1261 GACGTCTTCGTCCTTAACAATACCAGGCCACCGCTGGGCAATTGGTTCGGTTGTACCTGG
CTGCAGAAGCAGGAATTGTTATGGTCCGGTGGCGACCCGTTAACCAAGCCAACATGGACC

MetAsnSerThrGlyPheThrLysValCysGlyAlaProProCysValIleGlyGlyAla
1321 ATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCTCCTTGTGTCATCGGAGGGGCG
TACTTGAGTTGACCTAAGTGGTTTCACACGCCTCGCGGAGGAACACAGTAGCCTCCCCGC

GlyAsnAsnThrLeuHisCysProThrAspCysPheArgLysHisProAspAlaThrTyr
1381 GGCAACAACACCCTGCACTGCCCCACTGATTGCTTCCGCAAGCATCCGGACGCCACATAC
CCGTTGTTGTGGGACGTGACGGGGTGACTAACGAAGGCGTTCGTAGGCCTGCGGTGTATG

SerArgCysGlySerGlyProTrpIleThrProArgCysLeuValAspTyrProTyrArg
1441 TCTCGGTGCGGCTCCGGTCCCTGGATCACACCCAGGTGCCTGGTCGACTACCCGTATAGG
AGAGCCACGCCGAGGCCAGGGACCTAGTGTGGGTCCACGGACCAGCTGATGGGCATATCC

LeuTrpHisTyrProCysThrIleAsnTyrThrIlePheLysIleArgMetTyrValGly
1501 CTTTGGCATTATCCTTGTACCATCAACTACACCATATTTAAAATCAGGATGTACGTGGGA
GAAACCGTAATAGGAACATGGTAGTTGATGTGGTATAAATTTTAGTCCTACATGCACCCT

* 

GlyValGluHisArgLeuGluAlaAlaCysAsnTrpThrArgGlyGluArgCysAspLeu
1561 GGGGTCGAACACAGGCTGGAAGCTGCCTGCAACTGGACGCGGGGCGAACGTTGCGATCTG
CCCCAGCTTGTGTCCGACCTTCGACGGACGTTGACCTGCGCCCCGCTTGCAACGCTAGAC

GluAspArgAspArgSerGluLeuSerProLeuLeuLeuThrThrThrGlnTrpGlnVal
1621 GAAGACAGGGACAGGTCCGAGCTCAGCCCGTTACTGCTGACCACTACACAGTGGCAGGTC
CTTCTGTCCCTGTCCAGGCTCGAGTCGGGCAATGACGACTGGTGATGTGTCACCGTCCAG

LeuProCysSerPheThrThrLeuProAlaLeuSerThrGlyLeuIleHisLeuHisGln
1681 CTCCCGTGTTCCTTCACAACCCTACCAGCCTTGTCCACCGGCCTCATCCACCTCCACCAG
GAGGGCACAAGGAAGTGTTGGGATGGTCGGAACAGGTGGCCGGAGTAGGTGGAGGTGGTC

AsnIleValAspValGlnTyrLeuTyrGlyValGlySerSerIleAlaSerTrpAlaIle
1741 AACATTGTGGACGTGCAGTACTTGTACGGGGTGGGGTCAAGCATCGCGTCCTGGGCCATT
TTGTAACACCTGCACGTCATGAACATGCCCCACCCCAGTTCGTAGCGCAGGACCCGGTAA

LysTrpGluTyrValValLeuLeuPheLeuLeuLeuAlaAspAlaArgValCysSerCys
1801 AAGTGGGAGTACGTCGTTCTCCTGTTCCTTCTGCTTGCAGACGCGCGCGTCTGCTCCTGC
ttcaccctcatgcagcaagaggacaaggaagacgaacgtctgcgcgcgcagacgaggacg

1861 m^ATGATG^AC^TATCCCA^CGA^^r^^ G1UAlnL * uValI1,L<,u 

AACACCTACTACGAXGAGTAXAGGGTTCGCCICCGCCGAAACCTCITGGAGCAtTATGAA 

lS!l 


!0U 

8101 

2161 SSG«cSlS«GCTlGTCCTOS^CTxS nWh * t •uThrArgValGlu 

*witcscoA»rMico»ec»cS»SccSeSl5^iISJiiEeisireiSccn 

222i ^S^lS^? ^5i5^^ °y§i*f^i?giy*CT*.p>i.v,i I i. 

CGCGTTGACGTGCACACCTMfflWGGGGAGTTCCAGGCTCCCCCCGCGCTGCGfiCAGTAG 

2281 r?WTCM5TO?«TOlScCcS«TCTGSATTT5Sii?Sj tl,,l * ul * UL *“ A1 * 

ValPhAfiTvDpAf^it««M>fi »« _ 

2341 



2401 


1-WlyHl 


»asss§sSi§i^ 

““ SsiiSsiSsl^«BS 


“ ssSSSSssasilg 
SSSHSSS 

-"Sia^SsSSI^S 

2201 

CGGCTACCXTACCAGAGGTTCCCCACCTCCAACGACCGCGGGXAGTGCCGCAtGCGGGTC 

2751 ^^^^^sss^sgsessEsas: 

TT 
ValGluGlyGluValGlnIleValSerThrAlaAlaGlnThrPheLeuAlaThrCysIle
2821 GTGGAGGGTGAGGTCCAGATTGTGTCAACTGCTGCCCAAACCTTCCTGGCAACGTGCATC
CACCTCCCACTCCAGGTCTAACACAGTTGACGACGGGTTTGGAAGGACCGTTGCACGTAG 

TTACCCCACACGACCTGACAGATGGTGCCCCGGCCTTGCTCCTGGTAGCGCAGTGGGTTC

»« SCSSS^l^S^SS^3^S^SSSSSSSSSS& 

CCAGGACAGTAGGTCTACATATGGTTACATCTGGTTCTGGAACACCCGACCGGGCGAGGC


30.3 

GTTCCATCGGCGAGTAACTGTGGGACGTGAACGCCGAGGAGCCTGGAAATGGACCAGTGC 

ArgHisAlaAspValIleProValArgArgArgGlyAspSerArgGlySerLeuLeuSer
3061 AGGCACGCCGATGTCATTCCCGTGCGCCGGCGGGGTGATAGCAGGGGCAGCCTGCTGTCG
TCCGTGCGGCTACAGTAAGGGCACGCGGCCGCCCCACTATCGTCCCCGTCGGACGACAGC

ProArgProIleSerTyrLeuLysGlySerSerGlyGlyProLeuLeuCysProAlaGly
3121 CCCCGGCCCATTTCCTACTTGAAAGGCTCCTCGGGGGGTCCGCTGTTGTGCCCCGCGGGG
GGGGCCGGGTAAAGGATGAACTTTCCGAGGAGCCCCCCAGGCGACAACACGGGGCGCCCC

HisAlaValGlyIlePheArgAlaAlaValCysThrArgGlyValAlaLysAlaValAsp
3181 CACGCCGTGGGCATATTTAGGGCCGCGGTGTGCACCCGTGGAGTGGCTAAGGCGGTGGAC
GTGCGGCACCCGTATAAATCCCGGCGCCACACGTGGGCACCTCACCGATTCCGCCACCTG 

PheIleProValGluAsnLeuGluThrThrMetArgSerProValPheThrAspAsnSer
3241 TTTATCCCTGTGGAGAACCTAGAGACAACCATGAGGTCCCCGGTGTTCACGGATAACTCC
AAATAGGGACACCTCTTGGATCTCTGTTGGTACTCCAGGGGCCACAAGTGCCTATTGAGG

SerProProValValProGlnSerPheGlnValAlaHisLeuHisAlaProThrGlySer
3301 TCTCCACCAGTAGTGCCCCAGAGCTTCCAGGTGGCTCACCTCCATGCTCCCACAGGCAGC
AGAGGTGGTCATCACGGGGTCTCGAAGGTCCACCGAGTGGAGGTACGAGGGTGTCCGTCG 

GlyLysSerThrLysValProAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeu
3361 GGCAAAAGCACCAAGGTCCCGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTC
CCGTTTTCGTGGTTCCAGGGCCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAG


AsnProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIle
3421 AACCCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATC
TTGGGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAG


AspProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSer
3481 GATCCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCC
CTAGGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGG

ThrTyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIle
3541 ACCTACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATT
TGGATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAA

CysAspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAsp
3601 TGTGACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATCGGCACTGTCCTTGAC
ACACTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAGCCGTGACAGGAACTG


GlnAlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySer
3661 CAAGCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCC
GTTCGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGG 

ValThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIlePro
3721 GTCACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCT
CAGTGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGA


PheTyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCys
3781 TTTTACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGT
AAAATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACA

SJSSESrftriSSflsiu-uiauau...^ -- 


3841 



sssssss^ 

GTGSCCTAr^^^Hiiif^fPY^lSarValil, 



3961 GT<KCAACCGat^^^S 5?!J£S® rThr«l: 



4081 


4141 


vumSS&SSSi SSS 

SiE€Hi3ili siisTSi 



" 01 



44,1 OT^GCT^JcCT????Src^fS;5v l : tTr P L '-' 4 Cy=L« u u. Jlr ,^ u 

CGATCCCGACnCGGGGAGOKGIAGeACC^^ ^^rTTSi??;; ;^ 

G L! ; £ o^uu„UG l7PMtaPr ^ iV4icinijMiu 

SSSSSSSSSSS3gggg$ 

4 s 6 1 ATCACCCTGA^^CCCAG TCArr i ai^^? leMe tTnrC y sHetS erAl aAs pLeuGlu 

^^^cgigggicmkgiitaigtSS ^?^;^;-^ 

GTCGTCKGWaScT^TGCTCGTTGfirnr?^ii L * UAlaA1 * L « UAl4 *i*IyrCyi 

aGCAGTOCICGrGGACCCAaSSGraSeScreJmlSSSSr^' 


4621 


4681 


:gccgca^ccgac^^^ 

-sProAlalle 
iGCCGGCAATC 
JCCCTTCGGCCGTTAG 

4741 ATACCTGACAclGAAGTCCTCTACCGf^arT^^ 3pG *' uMe t c ’3uGluCysSerGlnHia 
4801

MTQSCATOXMeTCOTtCCCTXCTiCSWCMCraTCMSIT^^iScSSre 

GAGGACCTCTWraKaoaK^TCC6ICICCAATA«GGG^GACWCICTCCITGACC 

«» SSXSE3SS^SSSSiaBSS8S^S3SSS£^SaE 

GITrTTGAGCTCTSGAAGACCCGCTTCCTATACACCXTiAA4TA£TCACCCXATGTTATC 

4981 

AACCGCCCGAACAGTTGCGACOGACCArTGGGGGGGTAACQAAGTjScTACCGAAAATGT 


5041 ssasagKsssBBSBggg^^ 

cgacgacagtggtcgggtgattggtgaicggtttgggaggagSgttgtatZaccccccc 

5101 SSSS&ggS 

ACCCACCGACGGGTCGAGCGGCGGGGCCCACGGCGATGACGGAAACACCCGCGACCGAAT 

5161 ^^®sassfiBggsxM 

CGACCGv-GGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCC 

5571 Ir^^^-yY-^-^y Ala I ^ uVr alAlaPhftL y 3 IleMetSerGlyGluValPro 

■ 2!1 Sigggggggggg^ ^S^j^ ^SSSSS S S )}^ 

53.1 ^^^^^^225SSSS5SSgSSS5^SS2iS?gSiSi 

AGGTGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAG 

51.1 

CCGCACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTC 


TrpMetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyr

5401 TGGATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTAC
ACCTACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATG 

ValProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThr

5461 GTGCCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACC
CACGGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGG

» 

GlnLeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGly
5521 CAGCTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGT
GTCGAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCA


SerTrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrp
5581 TCCTGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGG
AGGACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACC

LeuLysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGly
5641 CTAAAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGG
GATTTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCC


TyrLysGlyValTrpArgValAspGlyIleMetHisThrArgCysHisCysGlyAlaGlu
5701 TATAAGGGGGTCTGGCGAGTGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAG
ATATTCCCCCAGACCGCTCACCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTC

IleThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsn
5761 ATCACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAAC
5821

4 

5881 

5941 

6001 

6061 

6121 

6181 

6241 

6301 

6361 

6421 

6481 

6541 


TAGTGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTG

TACACCTCACCCTGGAAGCGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGCAAGGA

CGCGGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATACACCTCTATTCC

GTCCACCCCCTGAAGGTGATCCACTGCCCATACTGATGACTGTTAGAGTTTACGGGCACG

GTCCAGGGTAGCGGGCTTAAAAACTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGC

<OT<^AMIICG«i*CO i C5CCCICCIcS?raIia^^gg^gJI“ 

GGCCATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTAC

GAGTGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGT

SerPheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIle

TCCTTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATC

AGGAAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAG


■ ssssssasssassssssssssEsgSgs^ss 

*•« eSgSlSSE^ESS^S£SS%SSSaB%^ 

GAGTGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCCAAACCGTCG


6721 


SerSerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSer

TCCTCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCT

AGGAGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGA
6781

6841 

6901 

6961 

7021 

7081

7141 

7201 

7261 

7321 

7381 

7441 

7501 

7561 

7621 

7681 

7741 


GGCTGCCCCCCCGACTCCGACCCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAG 

CCGACCGGGGGGCTGACGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTC


CKCCCCTICmrCITr<aCOKIWTr»CCT^TICCrTOA5cIIcO*T5CAOTCCtO 

TTAAACCACATAAGCTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAA

AspArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAla

GACAGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCG

CTGTCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGC

CGCAGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGT 

GTGAGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTC

AlaValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIle

GCCGTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATA

CGGCATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTAT


CTGTGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCA



LysProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAla 

AAGCCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCT

TTCGGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGA


LeuTyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGln

TTGTACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAA

AACATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTT


TyrSerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrPro

TACTCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCA

ATGAGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGT

MetGlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArg

ATGGGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGT

TACCCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCA


ThrGluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLys

ACGGAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAG

TGCCTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTC

SerLeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCys

TCCCTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGC

AGGGAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACG

* 

GlyTyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThr

GGCTATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACT
CCGATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGA


7801 

ACGATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAG


7861 


«SSSSasaa^SK5S5g^sag sS«M5ggagiaa>a^ 

CACACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCtCCTGCGCCGC 

79 2l sssssScOTcSgggScS^sasagSoSssSooci^jgsgs 

ICCK^TCICMMSI<:CCICC5M»CtO^S^§§|^*§§§«“ 

8101 GCraGTcSi^GC^AciS^ 

CGACGCACCCTCTGTCGTTCTGTGTGAGGTCAGXTAAGGACCGATCCGTTGTATTAGTAC 

8161 

8881 

92.1 fSS^SSSSSSSSSS^^ ' 

• CTTGGTGAACTAGATGGAGGTTAGTAAGTTTCTGAG 
-319 cactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctag
gtgaggtggtacttagtgaggggacactccttgatgacagaagtgcgtctttcgcagatc

-259 CCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCATA
GGTACCGCAATCATACTCACAGCACGTCGGAGGTCCTGGGGGGGAGGGCCCTCTCGGTAT

-199 GTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGA
CACCAGACGCCTTGGCCACTCATGTGGCCTTAACGGTCCTGCTGGCCCAGGAAAGAACCT

-139 tcaacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagtagt
agttgggcgagttacggacctctaaacccgcacgggggcgttctgacgatcggctcatca


-79 GTTGGGTCGCGAAAGGCCTTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAG
CAACCCAGCGCTTTCCGGAACACCATGACGGACTATCCCACGAACGCTCACGGGGCCCTC

-19 gtctcgtagaccgtgcacc
CAGAGCATCTGGCACGTGG

Arg Thr 

MetSerThrAsnProLysProGlnLysLysAsnLysArgAsnThrAsnArgArgProGln 

1 atgagcacgaatcctaaacctcaaaaaaaaaacaaacgtaacaccaaccgtcgcccacag
tactcgtgcttaggatttggagttttttttttgtttgcattgtggttggcagcgggtgtc

61 GACGTCAAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGG

ctgcagttcaagggcccaccgccagtctagcaaccacctcaaatgaacaacggcgcgtcc 

GlyProArgLeuGlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGly
121 GGCCCTAGATTGGGTGTCCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGT

ccgggatctaacccacacgcgcgctgctctttctgaaggctcgccagcgttggagctcca

ArgArgGlnProIleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGly

181 agacgtcagcctatccccaaggctcgtcggcccgagggcaggacctgggctcagcccggg
tctgcagtcggataggggttccgagcagccgggctcccgtcctggacccgagtcgggccc

TyrProTrpProLeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerPro

241 TACCCTTGGCCCCTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCC

atgggaaccggggagataccgttactcccgacgcccacccgccctaccgaggacagaggg

ArgGlySerArgProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGly
301 CGTGGCTCTCGGCCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGT 
GCACCGAGAGCCGGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCA 

LysValIleAspThrLeuThrCysGlyPheAlaAspLeuMetGlyTyrIleProLeuVal

361 aaggtcatcgatacccttacgtgcggcttcgccgacctcatggggtacataccgctcgtc

TTCCAGTAGCTATGGGAATGCACGCCGAAGCGGCTGGAGTACCCCATGTATGGCGAGCAG

GlyAlaProLeuGlyGlyAlaAlaArgAlaLeuAlaHisGlyValArgValLeuGluAsp

421 GGCGCCCCTCTTGGAGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAAGAC

CCGCGGGGAGAACCTCCGCGACGGTCCCGGGACCGCGTACCGCAGGCCCAAGACCTTCTG 

Thr 

CCGCACTTGATACGTTGTCCCTTGGAAGGACCAACGAGAAAGAGATAGAAGGAAGACCGG 

GACGAGAGAACG^CTGACACGGGCGAAGCCGGATGGTTCACGCGTTGAGGTGCCCCGAA 

atggtgcagtggttactaacgggattgagctcaiaacacatgctccgccggctacggtag 

CTGCACACTCCGGGGTGCGTCCCTTGCGTTCGTGAGGGCAACGCCTCGAGGTGTTGGGTG

gacgtgtgaggccccacgcagggaacgcaagcactcccgttgcggagctccacaacccac 


Fig 


601 


17-1 


661 
AlaMetThrProThrValAlaThrArgAspGlyLysLeuProAlaThrGlnLeuArgArg
721 GCGATGACCCCTACGGTGGCCACCAGGGATGGCAAACTCCCCGCGACGCAGCTTCGACGT
CGCTACTGGGGATGCCACCGGTGGTCCCTACCGTTTGAGGGGCGCTGCGTCGAAGCTGCA

HisIleAspLeuLeuValGlySerAlaThrLeuCysSerAlaLeuTyrValGlyAspLeu
781 CACATCGATCTGCTTGTCGGGAGCGCCACCCTCTGTTCGGCCCTCTACGTGGGGGACCTA
GTGTAGCTAGACGAACAGCCCTCGCGGTGGGAGACAAGCCGGGAGATGCACCCCCTGGAT 

CysGlySerValPheLeuValGlyGlnLeuPheThrPheSerProArgArgHisTrpThr
841 TGCGGGTCTGTCTTTCTTGTCGGCCAACTGTTCACCTTCTCTCCCAGGCGCCACTGGACG
ACGCCCAGACAGAAAGAACAGCCGGTTGACAAGTGGAAGAGAGGGTCCGCGGTGACCTGC 

ThrGlnGlyCysAsnCysSerIleTyrProGlyHisIleThrGlyHisArgMetAlaTrp
901 ACGCAAGGTTGCAATTGCTCTATCTATCCCGGCCATATAACGGGTCACCGCATGGCATGG
TGCGTTCCAACGTTAACGAGATAGATAGGGCCGGTATATTGCCCAGTGGCGTACCGTACC

Val 

AspMetMetMetAsnTrpSerProThrThrAlaLeuValMetAlaGlnLeuLeuArgIle
961 GATATGATGATGAACTGGTCCCCTACGACGGCGTTGGTAATGGCTCAGCTGCTCCGGATC 
CTATACTACTACTTGACCAGGGGATGCTGCCGCAACCATTACCGAGTCGACGAGGCCTAG 

ProGlnAlaIleLeuAspMetIleAlaGlyAlaHisTrpGlyValLeuAlaGlyIleAla
1021 CCACAAGCCATCTTGGACATGATCGCTGGTGCTCACTGGGGAGTCCTGGCGGGCATAGCG 
GGTGTTCGGTAGAACCTGTACTAGCGACCACGAGTGACCCCTCAGGACCGCCCGTATCGC 

TyrPheSerMetValGlyAsnTrpAlaLysValLeuValValLeuLeuLeuPheAlaGly
1081 TATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTGCTGCTGCTATTTGCCGGC 
ATAAAGAGGTACCACCCCTTGACCCGCTTCCAGGACCATCACGACGACGATAAACGGCCG 

ValAspAlaGluThrHisValThrGlyGlySerAlaGlyHisThrValSerGlyPheVal
1141 GTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCACACTGTGTCTGGATTTGTT
CAGCTGCGCCTTTGGGTGCAGTGGCCCCCTTCACGGCCGGTGTGACACAGACCTAAACAA

Fig. 17-2

SerLeuLeuAlaProGlyAlaLysGlnAsnValGlnLeuIleAsnThrAsnGlySerTrp 
1201 AGCCTCCTCGCACCAGGCGCCAAGCAGAACGTCCAGCTGATCAACACCAACGGCAGTTGG 
TCGGAGGAGCGTGGTCCGCGGTTCGTCTTGCAGGTCGACTAGTTGTGGTTGCCGTCAACC

HisLeuAsnSerThrAlaLeuAsnCysAsnAspSerLeuAsnThrGlyTrpLeuAlaGly
1261 CACCTCAATAGCACGGCCCTGAACTGCAATGATAGCCTCAACACCGGCTGGTTGGCAGGG
GTGGAGTTATCGTGCCGGGACTTGACGTTACTATCGGAGTTGTGGCCGACCAACCGTCCC 

I 

LeuPheTyrHisHisLysPheAsnSerSerGlyCysProGluArgLeuAlaSerCysArg
1321 CTTTTCTATCACCACAAGTTCAACTCTTCAGGCTGTCCTGAGAGGCTAGCCAGCTGCCGA
GAAAAGATAGTGGTGTTCAAGTTGAGAAGTCCGACAGGACTCTCCGATCGGTCGACGGCT

ProLeuThrAspPheAspGlnGlyTrpGlyProIleSerTyrAlaAsnGlySerGlyPro

1381 ccccttaccgattttgaccagggctggggccctatcagttatgccaacggaagcggcccc

GGGGAATGGCTAAAACTGGTCCCGACCCCGGGATAGTCAATACGGTTGCCTTCGCCGGGG

AspGlnArgProTyrCysTrpHisTyrProProLysProCysGlyIleValProAlaLys
1441 GACCAGCGCCCCTACTGCTGGCACTACCCCCCAAAACCTTGCGGTATTGTGCCGGCGAAG
CTGGTCGCGGGGATGACGACCGTGATGGGGGGTTTTGGAACGCCATAACACGGGCGCTTC

SerValCysGlyProValTyrCysPheThrProSerProValValValGlyThrThrAsp
1501 AGTGTGTGTGGTCCGGTATATTGCTTCACTCCCAGCCCCGTGGTGGTGGGAACGACCGAC
TCACACACACCAGGCCATATAACGAAGTGAGGGTCGGGGCACCACCACCCTTGCTGGCTG 


ArgSerGlyAlaProThrTyrSerTrpGlyGluAsnAspThrAspValPheValLeuAsn
1561 AGGTCGGGCGCGCCCACCTACAGCTGGGGTGAAAATGATACGGACGTCTTCGTCCTTAAC
TCCAGCCCGCGCGGGTGGATGTCGACCCCACTTTTACTATGCCTGCAGAAGCAGGAATTG

AsnThrArgProProLeuGlyAsnTrpPheGlyCysThrTrpMetAsnSerThrGlyPhe
1621 AATACCAGGCCACCGCTGGGCAATTGGTTCGGTTGTACCTGGATGAACTCAACTGGATTC 
TTATGGTCCGGTGGCGACCCGTTAACCAAGCCAACATGGACCTACTTGAGTTGACCTAAG 
ThrLysValCysGlyAlaProProCysValIleGlyGlyAlaGlyAsnAsnThrLeuHis
1681 ACCAAAGTGTGCGGAGCGCCTCCTTGTGTCATCGGAGGGGCGGGCAACAACACCCTGCAC
TGGTTTCACACGCCTCGCGGAGGAACACAGTAGCCTCCCCGCCCGTTGTTGTGGGACGTG

CysProThrAspCysPheArgLysHisProAspAlaThrTyrSerArgCysGlySerGly
1741 TGCCCCACTGATTGCTTCCGCAAGCATCCGGACGCCACATACTCTCGGTGCGGCTCCGGT 

acggggtgactaacgaaggcgttcgtaggcctgcggtgtatgagagccacgccgaggcca 

Leu

ProTrpIleThrProArgCysLeuValAspTyrProTyrArgLeuTrpHisTyrProCys
1801 CCCTGGATCACACCCAGGTGCCTGGTCGACTACCCGTATAGGCTTTGGCATTATCCTTGT 
GGGACCTAGTGTGGGTCCACGGACCAGCTGATGGGCATATCCGAAACCGTAATAGGAACA 

ThrIleAsnTyrThrIlePheLysIleArgMetTyrValGlyGlyValGluHisArgLeu
1861 ACCATCAACTACACCATATTTAAAATCAGGATGTACGTGGGAGGGGTCGAACACAGGCTG 

tggtagttgatgtggtataaattttagtcctacatgcaccctccccagcttgtgtccgac 

GluAlaAlaCysAsnTrpThrArgGlyGluArgCysAspLeuGluAspArgAspArgSer
1921 GAAGCTGCCTGCAACTGGACGCGGGGCGAACGTTGCGATCTGGAAGACAGGGACAGGTCC 
CTTCGACGGACGTTGACCTGCGCCCCGCTTGCAACGCTAGACCTTCTGTCCCTGTCCAGG 

GluLeuSerProLeuLeuLeuThrThrThrGlnTrpGlnValLeuProCysSerPheThr
1981 GAGCTCAGCCCGTTACTGCTGACCACTACACAGTGGCAGGTCCTCCCGTGTTCCTTCACA 
CTCGAGTCGGGCAATGACGACTGGTGATGTGTCACCGTCCAGGAGGGCACAAGGAAGTGT 

ThrLeuProAlaLeuSerThrGlyLeuIleHisLeuHisGlnAsnIleValAspValGln

2041 accctaccagccttgtccaccggcctcatccacctccaccagaacattgtggacgtgcag 
tgggatggtcggaacaggtggccggagtaggtggaggtggtcttgtaacacctgcacgtc 

TyrLeuTyrGlyValGlySerSerIleAlaSerTrpAlaIleLysTrpGluTyrValVal
2101 TACTTGTACGGGGTGGGGTCAAGCATCGCGTCCTGGGCCATTAAGTGGGAGTACGTCGTT
ATGAACATGCCCCACCCCAGTTCGTAGCGCAGGACCCGGTAATTCACCCTCATGCAGCAA

Fig. 17-3

LeuLeuPheLeuLeuLeuAlaAspAlaArgValCysSerCysLeuTrpMetMetLeuLeu
2161 CTCCTGTTCCTTCTGCTTGCAGACGCGCGCGTCTGCTCCTGCTTGTGGATGATGCTACTC 
GAGGACAAGGAAGACGAACGTCTGCGCGCGCAGACGAGGACGAACACCTACTACGATGAG 

IleSerGlnAlaGluAlaAlaLeuGluAsnLeuValIleLeuAsnAlaAlaSerLeuAla
2221 ATATCCCAAGCGGAGGCGGCTTTGGAGAACCTCGTAATACTTAATGCAGCATCCCTGGCC

tatagggttcgcctccgccgaaacctcttggagcattatgaattacgtcgtagggaccgg

* 

GlyThrHisGlyLeuValSerPheLeuValPhePheCysPheAlaTrpTyrLeuLysGly
2281 GGGACGCACGGTCTTGTATCCTTCCTCGTGTTCTTCTGCTTTGCATGGTATTTGAAGGGT
CCCTGCGTGCCAGAACATAGGAAGGAGCACAAGAAGACGAAACGTACCATAAACTTCCCA 

LysTrpValProGlyAlaValTyrThrPheTyrGlyMetTrpProLeuLeuLeuLeuLeu
2341 aagtgggtgcccggagcggtctacaccttctacgggatgtggcctctcctcctgctcctg 
ttcacccacgggcctcgccagatgtggaagatgccctacaccggagaggaggacgaggac 

LeuAlaLeuProGlnArgAlaTyrAlaLeuAspThrGluValAlaAlaSerCysGlyGly
2401 TTGGCGTTGCCCCAGCGGGCGTACGCGCTGGACACGGAGGTGGCCGCGTCGTGTGGCGGT 
AACCGCAACGGGGTCGCCCGCATGCGCGACCTGTGCCTCCACCGGCGCAGCACACCGCCA 

ValValLeuValGlyLeuMetAlaLeuThrLeuSerProTyrTyrLysArgTyrIleSer
2461 GTTGTTCTCGTCGGGTTGATGGCGCTGACTCTGTCACCATATTACAAGCGCTATATCAGC 
CAACAAGAGCAGCCCAACTACCGCGACTGAGACAGTGGTATAATGTTCGCGATATAGTCG 

Asp

TrpCysLeuTrpTrpLeuGlnTyrPheLeuThrArgValGluAlaGlnLeuHisValTrp
2521 TGGTGCTTGTGGTGGCTTCAGTATTTTCTGACCAGAGTGGAAGCGCAACTGCACGTGTGG

ACCACGAACACCACCGAAGTCATAAAAGACTGGTCTCACCTTCGCGTTGACGTGCACACC 

IleProProLeuAsnValArgGlyGlyArgAspAlaValIleLeuLeuMetCysAlaVal
2581 ATTCCCCCCCTCAACGTCCGAGGGGGGCGCGACGCCGTCATCTTACTCATGTGTGCTGTA 

taagggggggagttgcaggctccccccgcgctgcggcagtagaatgagtacacacgacat 
HisProThrLeuValPheAspIleThrLysLeuLeuLeuAlaValPheGlyProLeuTrp

2641 cacccgactctggtatttgacatcaccaaattgctgctggccgtcttcggacccctttgg
gtgggctgagaccataaactgtagtggtttaacgacgaccggcagaagcctggggaaacc 


2701 


IleLeuGlnAlaSerLeuLeuLysValProTyrPheValArgValGlnGlyLeuLeuArg

attcttcaagccagtttgcttaaagtaccctactttgtgcgcgtccaaggccttctccgg 

taagaagttcggtcaaacgaatttcatgggatgaaacacgcgcaggttccggaagaggcc


2761 


PheCysAlaLeuAlaArgLysMetIleGlyGlyHisTyrValGlnMetValIleIleLys

ttctgcgcgttagcgcggaagatgatcggaggccattacgtgcaaatggtcatcattaag 

aagacgcgcaatcgcgccttctactagcctccggtaatgcacgtttaccagtagtaattc 


LeuGlyAlaLeuThrGlyThrTyrValTyrAsnHisLeuThrProLeuArgAspTrpAla
2821 TTAGGGGCGCTTACTGGCACCTATGTTTATAACCATCTCACTCCTCTTCGGGACTGGGCG 

aatccccgcgaatgaccgtggatacaaatattggtagagtgaggagaagccctgacccgc 

HisAsnGlyLeuArgAspLeuAlaValAlaValGluProValValPheSerGlnMetGlu
2881 CACAACGGCTTGCGAGATCTGGCCGTGGCTGTAGAGCCAGTCGTCTTCTCCCAAATGGAG 
GTGTTGCCGAACGCTCTAGACCGGCACCGACATCTCGGTCAGCAGAAGAGGGTTTACCTC 


ThrLysLeuIleThrTrpGlyAlaAspThrAlaAlaCysGlyAspIleIleAsnGlyLeu
2941 ACCAAGCTCATCACGTGGGGGGCAGATACCGCCGCGTGCGGTGACATCATCAACGGCTTG 
TGGTTCGAGTAGTGCACCCCCCGTCTATGGCGGCGCACGCCACTGTAGTAGTTGCCGAAC 

ProValSerAlaArgArgGlyArgGluIleLeuLeuGlyProAlaAspGlyMetValSer 
3001 CCTGTTTCCGCCCGCAGGGGCCGGGAGATACTGCTCGGGCCAGCCGATGGAATGGTCTCC 
GGACAAAGGCGGGCGTCCCCGGCCCTCTATGACGAGCCCGGTGGGCTACCTTACCAGAGG 


LysGlyTrpArgLeuLeuAlaProIleThrAlaTyrAlaGlnGlnThrArgGlyLeuLeu
3061 AAGGGGTGGAGGTTGCTGGCGCCCATCACGGCGTACGCCCAGCAGACAAGGGGCCTCCTA 

ttccccacctccaacgaccgcgggtagtgccgcatgcgggtcgtctgttccccggaggat 


GlyCysIleIleThrSerLeuThrGlyArgAspLysAsnGlnValGluGlyGluValGln

3121 GGGTGCATAATCACCAGCCTAACTGGCCGGGACAAAAACCAAGTGGAGGGTGAGGTCCAG
CCCACGTATTAGTGGTCGGATTGACCGGCCCTGTTTTTGGTTCACCTCCCACTCCAGGTC 


IleValSerThrAlaAlaGlnThrPheLeuAlaThrCysIleAsnGlyValCysTrpThr
3181 ATTGTGTCAACTGCTGCCCAAACCTTCCTGGCAACGTGCATCAATGGGGTGTGCTGGACT
TAACACAGTTGACGACGGGTTTGGAAGGACCGTTGCACGTAGTTACCCCACACGACCTGA

Fig. 17


3241 


ValTyrHisGlyAlaGlyThrArgThrIleAlaSerProLysGlyProValIleGlnMet

GTCTACCACGGGGCCGGAACGAGGACCATCGCGTCACCCAAGGGTCCTGTCATCCAGATG 

CAGATGGTGCCCCGGCCTTGCTCCTGGTAGCGCAGTGGGTTCCCAGGACAGTAGGTCTAC


3301 


Ser Thr 

TyrThrAsnValAspGlnAspLeuValGlyTrpProAlaProGlnGlySerArgSerLeu
TATACCAATGTAGACCAAGACCTTGTGGGCTGGCCCGCTCCGCAAGGTAGCCGCTCATTG 
ATATGGTTACATCTGGTTCTGGAACACCCGACCGGGCGAGGCGTTCCATCGGCGAGTAAC 


ThrProCysThrCysGlySerSerAspLeuTyrLeuValThrArgHisAlaAspValIle
3361 ACACCCTGCACTTGCGGCTCCTCGGACCTTTACCTGGTCACGAGGCACGCCGATGTCATT 

tgtgggacgtgaacgccgaggagcctggaaatggaccagtgctccgtgcggctacagtaa 


ProValArgArgArgGlyAspSerArgGlySerLeuLeuSerProArgProIleSerTyr

3421 cccgtgcgccggcggggtgatagcaggggcagcctgctgtcgccccggcccatttcctac
gggcacgcggccgccccactatcgtccccgtcggacgacagcggggccgggtaaaggatg


LeuLysGlySerSerGlyGlyProLeuLeuCysProAlaGlyHisAlaValGlyIlePhe
3481 TTGAAAGGCTCCTCGGGGGGTCCGCTGTTGTGCCCCGCGGGGCACGCCGTGGGCATATTT

aactttccgaggagccccccaggcgacaacacggggcgccccgtgcggcacccgtataaa


ArgAlaAlaValCysThrArgGlyValAlaLysAlaValAspPheIleProValGluAsn

3541 agggccgcggtgtgcacccgtggagtggctaaggcggtggactttatccctgtggagaac
tcccggcgccacacgtgggcacctcaccgattccgccacctgaaatagggacacctcttg 
3601


3661


3721 


3781 


3841 


3901 


3961 


4021 


4081 


4141 


4201 


4261 


LeuGluThrThrMetArgSerProValPheThrAspAsnSerSerProProValValPro

ctagagacaaccatgaggtccccggtgttcacggataactcctctccaccagtagtgccc

GATCTCTGTTGGTACTCCAGGGGCCACAAGTGCCTATTGAGGAGAGGTGGTCATCACGGG 

GlnSerPheGlnValAlaHisLeuHisAlaProThrGlySerGlyLysSerThrLysVal

CAGAGCTTCCAGGTGGCTCACCTCCATGCTCCCACAGGCAGCGGCAAAAGCACCAAGGTC 

gtctcgaaggtccaccgagtggaggtacgagggtgtccgtcgccgttttcgtggttccag 

GGCCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTGGGGAGACAACGACGT 

Leu 

aS^ggctt^^tSSktcSg^tStgggat^icctaacatcaggacc 

TGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTAGGATTGIAGTCCTGG 

asasES^ 

ccccactcttgmaatggtgaccgtcggggtagtgcatgaggtggatgccgttcaaggaa 

cggctgcScccacgagccccccgcgaatacigtattattaaacactgctcacggtgagg 

tgcctacggtgtaggtagaacccgtagccgtgacaggaactggttcgtctctgacgcccc 

cgctctgaccaacacgagcggtggcggtggggaggcccgaggcagtgacacggggtaggg 

^catcgmgaggttcc^tgt^^c^magatcccttttta^caaggctatg 

ttgtagctcctccaacgagacaggtggtggcctctctagggaaaaatgccgttccgatag 

SSSgtSSggggg^a^ 

ggggagcttcattagttccccccctctgtagagtagaagacagtaagtttcttcttcacg 

ctgcttgagcggcgtttcgaccagcgtaacccgtagttacggcaccggatgatggcgcca 

gaactgcaSggcaSagggctggtcgccgctacaacagcagcaccgttggctacgggag 


Fig. 17- 


Tyr


4321 


Ja^^gatatcgccgctgaagctgagccactatctgacgttatgcacacagtgggtc 


Ser


4381 


4441 


^SgSaSgtcggaactgggatggaagtggtaacxctgttagtgcgagggggtccta 


4501 
TyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThrThrValArgLeuArg

4561 tatgacgcaggctgtgcttggtatgagctcacgcccgccgagactacagttaggctacga 

ATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGATGTCAAICCGATGCT 

• 

AlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeuGluPheTrpGluGly
4621 GCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTTGAATTTTGGGAGGGC
CGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAACTTAAAACCCTCCCG

ValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGlnThrLysGlnSerGly
4681 GTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAGACAAAGCAGAGTGGG
CAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTCTGTTTCGTCTCACCC

GluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAlaArgAlaGlnAlaPro 
4741 GAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCTAGGGCTCAAGCCCCT 
CTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGATCCCGAGTTCGGGGA 

ProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLysProThrLeuHisGly
4801 CCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAGCCCACCCTCCATGGG
GGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTCGGGTGGGAGGTACCC 

ProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIleThrLeuThrHisPro
4861 CCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATCACCCTGACGCACCCA
GGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAGTGGGACTGCGTGGGT

ValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluValValThrSerThrTrp
4921 GTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTCGTCACGAGCACCTGG
CAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAGCAGTGCTCGTGGACC

ValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeuSerThrGlyCysVal 
4981 GTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTGTCAACAGGCTGCGTG 
CACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGACAGTTGTCCGACGCAC 

Fig. 17-6

ValIleValGlyArgValValLeuSerGlyLysProAlaIleIleProAspArgGluVal *

5041 GTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATACCTGACAGGGAAGTC 
CAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTATGGACTGTCCCTTCAG 

LeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeuProTyrIleGluGln
5101 CTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTACCGTACATCGAGCAA
GAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAATGGCATGTAGCTCGTT

GlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeuLeuGlnThrAlaSer
5161 GGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTCCTGCAGACCGCGTCC 
CCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAGGACGTCTGGCGCAGG 

ArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGlnLysLeuGluThrPhe

5221 cgtcaggcagaggttatcgcccctgctgtccagaccaactggcaaaaactcgagaccttc
gcagtccgtctccaatagcggggacgacaggtctggttgaccgtttttgagctctggaag

TrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeuAlaGlyLeuSerThr
5281 TGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTGGCGGGCTTGTCAACG

acccgcttcgtatacaccttgaagtagtcaccctatgttatgaaccgcccgaacagttgc 

LeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAlaAlaValThrSerPro
5341 CTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCTGCTGTCACCAGCCCA
GACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGACGACAGTGGTCGGGT

LeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrpValAlaAlaGlnLeu
5401 CTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGGGTGGCTGCCCAGCTC 
GATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACCCACCGACGGGTCGAG 

AlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAlaGlyAlaAlaIleGly
5461 GCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCTGGCGCCGCCATCGGC 
CGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGACCGCGGCGGTAGCCG 

SerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyrGlyAlaGlyValAla
5521


5581


5641 


5701 


5761 


5821 


5881 


5941 


6001 


6061 


6121 


6181 


6241 


6301 


6361 


6421 


AGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTATGGCGCGGGCGTGGCG 

TCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATACCGCGCCCGCACCGC 

I 

Gly 

GlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSerThrGluAspLeuVal

ggagctcttgtggcattcaagatcatgagcggtgaggtcccctccacggaggacctggtc 

CCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGGTGCCTCCTGGACCAG 

AsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGlyValValCysAlaAla

AATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGCGTGGTCTGTGCAGCA

ttagatgacgggcggtaggagagcgggcctcgggagcatcagccgcaccagacacgtcgt 

IleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrpMetAsnArgLeuIle

ATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGGATGAACCGGCTGATA

tatgacgcggccgtgcaaccgggcccgctcccccgtcacgtcacctacttggccgactat

AlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrValProGluSerAspAla

gccttcgcctcccgggggaaccatgtttcccccacgcactacgtgccggagagcgatgca

CGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCACGGCCTCTCGCTACGT

HisCys

AlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGlnLeuLeuArgArgLeu

GCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAGCTCCTGAGGCGACTG

CGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTCGAGGACTCCGCTGAC

HisGlnTrpIleSerSerGluCysThrThrProCysSerGlySerTrpLeuArgAspIle

CACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCCTGGCTAAGGGACATC

GTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGGACCGATTCCCTGTAG

accctgacctatacgctccacaactcgctgaaattctggaccgattttcgattcgagtac 

ProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyrLysGlyValTrpArg

CCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTATAAGGGGGTCTGGCGA

GGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATATTCCCCCAGACCGCT 

ValAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIleThrGlyHisValLys

GTGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATCACTGGACATGTCAAA

CACCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAGTGACCTGTACAGTTT

TTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTACACCTCACCCTGGAAG 

GGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGCGGCTTGATGTGCAAG

CGCGATACCTCCCACAGACGTCTCCTTATACACCTCTATTCCGTCCACCCCCTGAAGGTG

atgcactgcccatactgatgactgttagagtttacgggcacggtccagggtagcgggctt

aaaaagtgtcttaacctgccccacgcggatgtatccaaacgcggggggacgttcgggaac 

S?^“STisS^™xGAGGrecmiGGiKcxiccca«Km»T 


Fig. 17-
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ProCysGluProGluPrcAspValAlaValLeuThrSerMetLeuThrAspProSerHis 

6481 CCTTGCGAGCCCGAACCGGACGTCGCCGTGTTGACGTCCATGCTCACTGATCCCTCCCAT
GGAACGCTCGGGCTTGGCCTGCAGCGGCACAACTGCAGGTACGAGTGACTAGGGAGGGTA

IleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerProProSerValAlaSer
6541 ATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCCCCCTCTGTGGCCAGC
TATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGGGGGAGACACCGGTCG

SerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCysThrAlaAsnHisAsp
6601 TCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGCACCGCTAACCATGAC
AGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACGTGGCGATTGGTACTG

SerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGlnGluMetGlyGlyAsn
6661 TCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAGGAGATGGGCGGCAAC
AGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTCCTCTACCCGCCGTTG

IleThrArgValGluSerGluAsnLysValValIleLeuAspSerPheAspProLeuVal
6721 ATCACCAGGGTTGAGTCAGAAAACAAAGTGGTCATTCTGGACTCCTTCGATCCGCTTGTG
TAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGGAAGCTAGGCGAACAC

AlaGluGluAspGluArgGluIleSerValProAlaGluIleLeuArgLysSerArgArg
6781 GCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTGCGGAAGTCTCGGAGA
CGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGACGCCTTCAGAGCCTCT

PheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnProProLeuValGluThr
6841 TTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCCCCGCTAGTGGAGACG
AAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGGGGCGATCACCTCTGC

TrpLysLysProAspTyrGluProProValValHisGlyCysProLeuProProProLys
6901 TGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGTCCGCTTCCACCTCCAAAG
ACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACAGGCGAAGGTGGAGGTTTC


SerProProValProProProArgLysLysArgThrValValLeuThrGluSerThrLeu
6961 TCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTCACTGAATCAACCCTA
AGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAGTGACTTAGTTGGGAT

Ser
SerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSerSerThrSerGlyIle
7021 TCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCCTCAACTTCCGGCATT
AGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGGAGTTGAAGGCCGTAA


Fig. 17-8


ThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGlyCysProProAspSer
7081 ACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGCTGCCCCCCCGACTCC
TGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCGACGGGGGGGCTGAGG

PheAla
AspAlaGluSerTyrSerSerMetProProLeuGluGlyGluProGlyAspProAspLeu
7141 GACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCTGGGGATCCGGATCTT
CTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGACCCCTAGGCCTAGAA


SerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGluAspValValCysCys
7201 AGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAGGATGTCGTGTGCTGC
TCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTCCTACAGCACACGACG

SerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAlaAlaGluGluGlnLys
7261 TCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCCGCGGAAGAACAGAAA
AGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGGCGCCTTCTTGTCTTT

LeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsnLeuValTyrSerThr
7321 CTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAATTTGGTGTATTCCACC
GACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTAAACCACATAAGGTGG

ThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAspArgLeuGlnValLeu
7381 ACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGACAGACTGCAAGTTCTG
TGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTGTCTGACGTTCAAGAC







AspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAlaSerLysValLysAla
7441 GACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCGTCAAAAGTGAAGGCT
CTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGCAGTTTTCACTTCCGA

Phe
AsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHisSerAlaLysSerLys
7501 AACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACACTCAGCCAAATCCAAG
TTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTGAGTCGGTTTAGGTTC

PheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAlaValThrHisIleAsn
7561 TTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCCGTAACCCACATCAAC
AAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGGCATTGGGTGTAGTTG

SerValTrpLysAspLeuLeuGluAspAsnValThrProIleAspThrThrIleMetAla
7621 TCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGACACTACCATCATGGCT
AGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTGTGATGGTAGTACCGA

LysAsnGluValPheCysValGlnProGluLysGlyGlyArgLysProAlaArgLeuIle
7681 AAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAGCCAGCTCGTCTCATC
TTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTCGGTCGAGCAGAGTAG

ValPheProAspLeuGlyValArgValCysGluLysMetAlaLeuTyrAspValValThr
7741 GTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTGTACGACGTGGTTACA
CACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAACATGCTGCACCAATGT

LysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyrSerProGlyGlnArg 
7801 AAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATACTCACCAGGACAGCGG
TTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATGAGTGGTCCTGTCGCC

ValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMetGlyPheSerTyrAsp
7861 GTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATGGGGTTCTCGTATGAT
CAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTACCCCAAGAGCATACTA

ThrArgCysPheAspSerThrValThrGluSerAspIleArgThrGluGluAlaIleTyr
7921 ACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACGGAGGAGGCAATCTAC
TGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGCCTCCTCCGTTAGATG

GlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSerLeuThrGluArgLeu
7981 CAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCCCTCACCGAGAGGCTT
GTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGGGAGTGGCTCTCCGAA

Gly
TyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGlyTyrArgArgCysArg
8041 TATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGCTATCGCAGGTGCCGC
ATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCGATAGCGTCCACGGCG

AlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCysTyrIleLysAlaArg
8101 GCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGCTACATCAAGGCCCGG
CGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACGATGTAGTTCCGGGCC


AlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuValCysGlyAspAspLeu
8161 GCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTGTGTGGCGACGACTTA
CGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCACACACCGCTGCTGAAT

ValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSerLeuArgAlaPheThr
8221 GTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGCCTGAGAGCCTTCACG
CAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCGGACTCTCGGAAGTGC

GluAlaMetThrArgTyrSerAlaProProGlyAspProProGlnProGluTyrAspLeu
8281 GAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAACCAGAATACGACTTG
CTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTTGGTCTTATGCTGAAC

GluLeuIleThrSerCysSerSerAsnValSerValAlaHisAspGlyAlaGlyLysArg
8341 GAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGACGGCGCTGGAAAGAGG
CTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTGCCGCGACCTTTCTCC
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ValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAlaLAlaTrpGluThrAla 
8401 GTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCTGCGTGGGAGACAGCA 

CAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGACGCACCCTCTGTCGT

ArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPheAlaProThrLeuTrp
8461 AGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTTGCCCCCACACTGTGG
TCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAACGGGGGTGTGACACC

*
AlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAlaArgAspGlnLeuGlu
8521 GCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCCAGGGACCAGCTTGAA
CGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGGTCCCTGGTCGAACTT

GlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGluProLeuAspLeuPro
8581 CAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAACCACTTGATCTACCT
GTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTTGGTGAACTAGATGGA

ProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHisSerTyrSerProGly
8641 CCAATCATTCAAAGACTGCATGGCCTCAGCGCATTTTCACTCCACAGTTACTCTCCAGGT
GGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTGTCAATGAGAGGTCCA

GluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValProProLeuArgAlaTrp
8701 GAAATTAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCGCCCTTGCGAGCTTGG
CTTTAATTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGCGGGAACGCTCGAACC

Gly
ArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGlyGlyArgAlaAlaIle
8761 AGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGAGGCAGGGCTGCCATA
TCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCTCCGTCCCGACGGTAT

CysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
8821 TGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAAC
ACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTTG

Fig. 17-10
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